ON ABBREVIATED WECHSLER-BELLEVUE SCALES


QUINN McNEMAR 


STANFORD UNIVERSITY 


The recent appearance of eight articles on
abbreviated forms of the Wechsler-Bellevue
Scale indicates that test users are interested
not only in shorter tests but also in their
“validities” as judged by the degree of correla- 
tion between various short scales and the full 
Bellevue Scale. The quest for these “validities” 
has involved the use of atypical samples: pris- 
oners, hospital patients, mentally retarded, 
high school students, student nurses, subnor- 
mals, clinic and guidance groups. Obviously, 
valid “validities” should be based on samples
which are neither too homogeneous nor too 
heterogeneous—the coefficients should be nei- 
ther spuriously low nor spuriously high. 

Satisfactory “validity” coefficients could be
determined from Wechsler’s unpublished 
scores for his standardization groups by the 
laborious process of summing subtest scores for 
various combinations, then correlating these 
scores with the total scores based on all ten 
subtests. Fortunately, it is unnecessary to pro- 
ceed by such a hammer-and-tongs method. 

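A general formula for determining the cor-
relation between the total score, T, on n tests
with the sum

$$S = X_a + X_b + \cdots + X_k$$

of k tests drawn from among the n tests may
be written as

$$r_{TS} = \frac{\sum_{i \in S} \sigma_i^2 + \sum_{i \in S} \sum_{j \neq i} r_{ij}\sigma_i\sigma_j}{\sqrt{\sum_{i=1}^{n} \sigma_i^2 + 2\sum_{i<j} r_{ij}\sigma_i\sigma_j}\;\sqrt{\sum_{i \in S} \sigma_i^2 + 2\sum_{i<j;\ i,j \in S} r_{ij}\sigma_i\sigma_j}}$$

where the inner sum of the numerator runs over all n tests, the first
radical is the standard deviation of T, and the second radical is the
standard deviation of S.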

The needed intercorrelations among the
ten subtests are given by Wechsler [9, Table
41] for 355 cases, ages 20-34, of his norm
group. The required σ’s are known since the
raw subtest scores are transformed to weighted
scores with standard deviation of 3, a value
which is fairly well approximated for age
groups 20-24, 25-29, and 30-34 [9, Tables
39-40]. For the Wechsler-Bellevue situation,
with σ’s for all tests equal to 3 and the sum of
the intercorrelations calculable from Wechs-
ler’s Table 41, it is easy to ascertain that the
value of the first radical of the general formula
becomes 21.72, and then the expression
simplifies to


+ SS, 
ie = x 
7.26Vk + 23rx 


The meaning of this will be clarified if we
consider the correlation between total score
and the sum of scores on tests a, b, and c, and
write the formula as

$$r_{TS} = \frac{3 + \sum_{j \neq a} r_{aj} + \sum_{j \neq b} r_{bj} + \sum_{j \neq c} r_{cj}}{7.26\sqrt{3 + 2(r_{ab} + r_{ac} + r_{bc})}}$$


Note that we simply need the sum of the r’s 
for each selected test with every other test (the 
sum of columns in the correlation matrix for 
the ten tests) and that we need the sum of 
the intercorrelations among the selected tests. 
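
The computation just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's worksheet: the function name `part_whole_correlation` and the small demonstration matrix are hypothetical, and the code assumes, as in the Wechsler-Bellevue case, that every subtest has the same standard deviation so that the σ's cancel. The first radical is computed directly from the matrix rather than taken as a constant.

```python
import math
from itertools import combinations

def part_whole_correlation(R, selected):
    """Correlation between the sum of the selected tests and the sum of all
    n tests, assuming every test has the same standard deviation."""
    n = len(R)
    chosen = sorted(set(selected))
    k = len(chosen)
    # Column sums: each selected test's r with every other test.
    column_sums = sum(R[i][j] for i in chosen for j in range(n) if j != i)
    # Intercorrelations among all n tests, each pair counted once.
    all_pairs = sum(R[i][j] for i, j in combinations(range(n), 2))
    # Intercorrelations among the selected tests only, each pair once.
    chosen_pairs = sum(R[i][j] for i, j in combinations(chosen, 2))
    first_radical = math.sqrt(n + 2 * all_pairs)      # SD of total, in sigma units
    second_radical = math.sqrt(k + 2 * chosen_pairs)  # SD of short scale, in sigma units
    return (k + column_sums) / (first_radical * second_radical)

# Hypothetical 4-test matrix in which every intercorrelation is .50.
R_demo = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
print(round(part_whole_correlation(R_demo, [0, 1]), 3))  # 0.913
```

With the ten-by-ten matrix of Wechsler's Table 41 in place of `R_demo`, the same call would give the "validity" of any chosen team of subtests.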

By use of the formula, we can easily deter- 
mine the correlation between the sum of scores 
on any combination of subtests and the total 
score. Such a correlation will be very nearly 
that which would be obtained if computed di- 
rectly from the actual scores for the 355 cases 
of the standardization group. We estimate 
that the margin of error is of the order .005 
or less. 

The total number of possible combinations 
is fairly large: 45 teams of two tests, 120 
teams of three, 210 of four, 252 of five, and 
so on. Presumably, the users of abbreviated 
scales will be most interested in those com- 
binations which correlate highest with the to- 
tal score. Accordingly, we present the ten best 
teams of two, of three, of four, and of five 
tests. 
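
The counts of possible teams follow from the binomial coefficient for ten subtests. The short check below is only an arithmetic illustration; the variable names are hypothetical.

```python
from math import comb

N_SUBTESTS = 10  # subtests entering the total score

# Number of distinct teams of two, three, four, and five subtests.
team_counts = {k: comb(N_SUBTESTS, k) for k in range(2, 6)}
print(team_counts)                # {2: 45, 3: 120, 4: 210, 5: 252}
print(sum(team_counts.values()))  # 627
```

The total of 627 agrees with the number of possible combinations cited in the Comments.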

TABLE 1
CORRELATIONS FOR BEST PAIRS

[illegible]                      .881
Blks., Sim.                      [illegible]
[illegible]                      [illegible]
[illegible]                      .853
[illegible]                      [illegible]
[illegible]                      .844
[illegible]                      .844
[illegible]                      .841
Arith., Dig. Sym.                [illegible]


The correlations or “validities” for the best 
pairs of tests are given in Table 1. Digit Span 
plus Picture Arrangement, the pair proposed 
by Gurvitz [3], does not appear in the list be- 
cause it correlates only .770, a value which is 
near .741, the correlation for the worst pair 
(Dig. Sp., Objects). The abbreviated scale 
(Comp., Arith.) suggested by Cummings [1] 
has a validity of .833. 

TABLE 2
CORRELATIONS FOR BEST TRIADS

[illegible]                      .912
[illegible]                      .912
Blks., Dig. Sym., Sim.           .911
[illegible]                      .910
Pic. Compl., Dig. Sym., Sim.     .907
[illegible]                      [illegible]
[illegible]                      [illegible]
[illegible]                      .903
Comp., Blks., Sim.               [illegible]
Dig. Sp., Pic. Compl., Sim.      [illegible]





Table 2 contains the correlations for the 
best combinations of three tests. It will be 
noted that the CAS (Comp., Arith., Sim.) 
scale is not among the ten best. As a matter of 
fact, this much used scale is surpassed by near- 
ly four dozen other combinations of three tests. 
For the standardization data, the CAS scale 
correlates only .864 with the total score, a val- 
ue which is far below the .956 given by Rabin 
[7] who first suggested the CAS combination. 
Rabin’s value is spuriously high because of the 
heterogeneity of his sample, which had a vari- 
ance 60 per cent larger than that of Wechsler’s 
norm group. In terms of the error of estimate 
accompanying predictions of total scores, the 
best triad of Table 2 is about 20 per cent better 
than the CAS scale. 
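
The comparison of errors of estimate rests on the coefficient of alienation, the square root of 1 - r². A minimal sketch follows; the function names are hypothetical, and the optional scaling by a standard deviation is included only so that the same quantity can be expressed in score units (for example, the SD of 14.83 IQ points adopted in the Comments).

```python
import math

def alienation(r):
    """Coefficient of alienation: the error of estimate in SD units."""
    return math.sqrt(1.0 - r * r)

def error_of_estimate(r, sd):
    """Standard error of estimate in the units of the predicted score."""
    return sd * alienation(r)

best_triad, cas = 0.912, 0.864
# Proportional reduction in the error of estimate, best triad versus CAS.
improvement = 1.0 - alienation(best_triad) / alienation(cas)
print(round(alienation(best_triad), 3), round(alienation(cas), 3))
print(round(improvement, 2))
```

The reduction comes to a little under one fifth, in line with the "about 20 per cent" stated above.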


TABLE 3
CORRELATIONS FOR BEST QUARTETS

Comp., Arith., Blks., Dig. Sym.          .932
Comp., Blks., Dig. Sym., Sim.            .929
Inf., Blks., Dig. Sym., Sim.             .928
Comp., Arith., Pic. Compl., Dig. Sym.    .928
Arith., Pic. Compl., Dig. Sym., Sim.     .928
[illegible]                              .928
Dig. Sp., Pic. Compl., Blks., Sim.       [illegible]
Pic. Compl., Blks., Dig. Sym., Sim.      [illegible]
Comp., Pic. Compl., Blks., Dig. Sym.     [illegible]
Arith., Blks., Dig. Sym., Sim.           .926


The best combinations of four tests, along
with their “validities,” are set forth in Table
3. The quartet (Inf., Arith., Blks., Sim.) pro- 
posed by Geil [2] is not among the ten best. 
Its correlation is .924 which is considerably 
below the .966 reported by Geil for a group 
even more heterogeneous than that of Rabin. 
Nor do the two combinations of four suggested 
by Patterson [5] appear among the best. His 
reported correlations of .936 (for Comp., 
Arith., Blks., Sim.) and .948 (for Inf., Pic. 
Arr., Pic. Compl., Dig. Sym.) are also spuri- 
ous because of heterogeneity. For the standard- 
ization group, these yield r’s of .921 and .909, 
respectively. 


TABLE 4
CORRELATIONS FOR BEST QUINTETS

Comp., Arith., Pic. Compl., Blks., Dig. Sym.         .944
Comp., Pic. Compl., Blks., Dig. Sym., Sim.           .942
Arith., Pic. Compl., Blks., Dig. Sym., Sim.          .942
Inf., Pic. Compl., Blks., Dig. Sym., Sim.            .942
Inf., Dig. Sp., Pic. Compl., Blks., Sim.             .941
Comp., Inf., Pic. Compl., Blks., Dig. Sym.           .940
Comp., Arith., Pic. Arr., Blks., Dig. Sym.           .940
Comp., Dig. Sp., Pic. Compl., Blks., Sim.            .939
Comp., Arith., Pic. Arr., Pic. Compl., [illegible]   [illegible]
Comp., Arith., Blks., Dig. Sym., Sim.                .939


Correlations for the best ten teams of five 
tests are presented in Table 4. We have found 
only one suggestion for, and use of, a scale of 
five tests, that of Hunt et al. [4] who obtained 
an r of .96 for the combination: Comp., Arith., 
Dig. Sp., Pic. Arr., Sim., for 46 high school 
students. Hunt et al. succeeded in picking one 
of the mediocre combinations of five. Their 
correlation of .96, if dependable, would lead 
one to think that this combination is much 
better than it actually is as judged by an r of 
.916 for Wechsler’s 355 cases. The coefficient 
of alienation for their reported r is .28 as con-
trasted with .40 for an r of .916.

COMMENTS

a. We arbitrarily chose to report on the
ten best combinations; when the combinations
are arranged by descending “validities,” the
drop is gradual. The r’s drop to the following
minimum values: for teams of two, to .741;
for teams of three, to .831; for teams of four,
to [illegible]; for teams of five, to [illegible].

b. As is well known, the usefulness of such
abbreviated scales depends upon the accuracy
with which total score IQ’s can be estimated.
If we take 14.83 as the SD for the typical
distribution of Wechsler-Bellevue IQ’s, the
error of estimate in IQ points for the best team
of two is 6.9, of three is 6.1, of four is 5.4, of
five is 4.9. Perhaps a few users of abbreviated
scales need to be reminded that one out of
twenty predictions will involve errors twice as
large as the above figures.

c. In general, the “validities” reported in
the literature for various combinations tend to
run considerably higher than we find for the
355 cases of the standardization group. As pre-
viously indicated, the explanation for this is
the heterogeneity of the groups used to deter-
mine the correlations. There can be no doubt
about the greater dependability of the correla-
tions based on Wechsler’s group.

d. In making the calculations for this pa-
per, it was not necessary to compute the cor-
relations for all the possible 627 combinations
in order to arrive at the ten best teams of each
size. The number of r’s computed was 249.
Had this task been done by adding subtest
scores for the 355 cases, a total of [illegible]
test scores would have been summed to obtain
a total of 88,395 abbreviated scores. Our
method not only avoided this tedium but also
the job of calculation by some [illegible]
version of the product moment coefficient.
Once the columns of the matrix of intercor-
relations have been summed, the “validity” for
a combination of, e.g., four tests can be as-
certained in about five minutes.

Received June 28, 1949.

REFERENCES

1. Cummings, S. B., MacPhee, H. M., and Wright, H. F. A rapid method
   of estimating the IQ’s of subnormal white adults. J. Psychol.,
   1946, 21, 81-89.
2. Geil [remainder of entry illegible]
3. Gurvitz, M. S. [title partly illegible] Wechsler-Bellevue test.
   Amer. J. Orthopsychiat., 1945, 15, 727-732.
4. Hunt et al. [remainder of entry illegible]
5. Patterson, C. H. A comparison of various “short forms” of the
   Wechsler-Bellevue Scale. J. consult. Psychol., 1946, 10, 260-267.
6. [entry illegible]
7. Rabin, A. I. A short form of the Wechsler-Bellevue test.
   J. appl. Psychol., 1943, 27, 320-324.
8. Springer, N. N. A short form of the Wechsler-Bellevue intelligence
   [remainder of entry partly illegible] 344.
9. Wechsler, D. The measurement of adult intelligence. (3rd Ed.)
   Baltimore: Williams & Wilkins, 194[illegible].






WECHSLER-BELLEVUE RELIABILITY AND THE VALIDITY 
OF CERTAIN DIAGNOSTIC SIGNS OF THE NEUROSES¹


FRANCIS M. GILHOOLY 


FORDHAM UNIVERSITY AND 
VETERANS ADMINISTRATION HOSPITAL, BRONX, NEW YORK 


In a previous paper [1] a study of the relationship
between variability and ability
on the Wechsler-Bellevue Intelligence Scale 
was presented. Briefly, the results indicated 
that as the extremes of the IQ continuum are 
approached in either direction from the aver- 
age, there is a strong tendency for the varia- 
bility among the subtest scores to decrease. This 
would mean that at the upper extreme as well 
as at the lower, smaller deviations of subtest 
scores from the mean subtest score are signifi- 
cant. 

However, variability, and consequently di- 
agnostic signs and patterns, are valuable only 
insofar as the various subtests of the scale are 
reliable. If the scores are not relatively free 
from errors of measurement, the clinician can- 
not be certain whether the variability (and 
the supposedly diagnostic signs) manifested on 
the test are really indicative of mental disturb- 
ance or are merely due to errors inherent in the 
test itself. 

Although Wechsler [7] does not present
data for the reliability of the individual sub-
tests, there are two studies [4, 5] in the litera-
ture which do. Since these studies are based on
somewhat similar samplings, that is, mixed
groups of psychiatric patients in which schizo-
phrenics were in the majority, and because
both employed the test-retest method of treat-
ing the data, their results should be directly
comparable. Yet, differences in the calculated
reliabilities, as high as .24 points for the Simi-
larities subtest, exist between the findings of
the two investigations.

¹Published with permission of the Chief Medical
Director, Department of Medicine and Surgery,
Veterans Administration, who assumes no responsi-
bility for the opinions expressed or the conclusions
drawn by the author.

Based on one section of a thesis submitted to Ford-
ham University in partial fulfillment of the re-
quirements for the degree of Master of Arts. The
writer gratefully acknowledges his indebtedness to
Drs. Anne Anastasi and Dorothea McCarthy of
Fordham University, to Dr. Joseph Levi, former
Chief Psychologist of the Kingsbridge Veterans
Hospital, New York, and to Dr. David Wechsler,
consultant to the Veterans Administration.

In the interpretation of these differences, 
one consideration must be borne in mind, that 
is, that the subjects of the two studies were all 
psychiatric patients, the majority being psychotic.
It is a well-known fact that the mental
health of such subjects is constantly fluctuating, 
some showing improvement, and some, especi- 
ally in the schizophrenic group, regressing fur- 
ther in their disease process. The consistently 
higher results obtained by Hamister may be 
due to the fact that the mean retest interval of 
his subjects was less than one month, while 
Rabin’s mean retest interval was slightly over 
13 months. The retest correlations of the lat- 
ter study may reflect, in part, certain broad 
behavior changes associated with the mental 
disorder, rather than errors affecting test 
scores. This raises a debatable point in the in- 
terpretation of retest reliabilities, a difficulty 
not present in split-half reliabilities since this 
method reduces the time interval to zero. 

It is one of the purposes of this paper to 
present the results of an investigation of the 
split-half reliabilities of four subtests of the 
Wechsler-Bellevue Intelligence Scale, Form I, 
namely, Information, Comprehension, Simi- 
larities, and Vocabulary. 

Following the publication of the Bellevue 
Scale and the claims of its author as to its diag- 
nostic potentialities, many other authors, in 
seeking to verify its usefulness, have discovered 
diagnostic signs of their own. Perhaps the most 
notable in this area is Rapaport’s work [6]. 
Since the data collected in the present study 
consisted of the Wechsler-Bellevue records of
122 hospital patients diagnosed as psychoneu- 
rotics, they were well suited for an investiga- 
tion of the validity of certain of Rapaport’s 
signs of the neuroses. It is the second purpose 
of this paper to present these results. 


SUBJECTS 


The subjects for this investigation consist 
entirely of white, male veterans of World 
War II. Each of the 122 subjects was a pat- 
ient in the neuropsychiatric section of the 
Kingsbridge Veterans Hospital, Bronx, New 
York. The criteria used in the selection of all 
cases were: 


1. Final, or discharge, diagnosis of psychoneuro- 
sis. Cases with psychoneurosis superimposed upon 
another syndrome were discarded. The clinical psy- 
chologist’s case report was not utilized in any way 
in order to avoid a biased sampling since his diag- 
nosis, at least in part, was based on the technique 
under investigation. 

2. Age range from 19 to 35, inclusive. Age was 
limited to 35 or less in order to eliminate as far as 
possible the effects of decreasing efficiency due to 
advancing age. 

3. Each subject must have received a Wechsler- 
Bellevue Intelligence Test, Form I, as part of his 
diagnostic procedure. Only those cases to whom the
entire eleven subtests were administered were acceptable.


The age of this group ranges from 19 to 
35 years with a mean of 27.08 and a standard 
deviation of 4.35 years. The range of intelli- 
gence quotients is from 75 to 135, with a 
mean of 107.19 and a standard deviation of 
13.48. The IQ distribution, in terms of 
Wechsler’s classification, is as follows: Very 
Superior, 11; Superior, 11; Bright Normal, 
31; Average, 57; Dull Normal, 10; Borderline, 2.

A comparison of this group with Wechsler’s 
standardization population shows the present 
sampling to excel in educational achievement. 
While approximately 62 per cent of Wechs- 
ler’s sampling never entered high school, only 
25 per cent of the subjects of this study left 
school before the ninth grade. 

For a more complete description of the sub- 
ject group upon which this investigation is 
based, the reader is referred to the original 
work [2]. 


THE RELIABILITY OF FOUR BELLEVUE 
SCALE SUBTESTS 


The split-half method was used to estimate 
the reliability of four of the subtests of the 
Bellevue Scale because it reduces the time in- 
terval between the two tests to zero and also 
eliminates any possible practice effects. It 
yields the accuracy of the scores at the time 
the individuals are measured. As indicated 
above, retest reliabilities are especially ques- 
tionable when applied to psychiatric patients. 

Four subtests of the Bellevue Scale were 
chosen for this part of the study: Information, 
Comprehension, Similarities, and Vocabulary. 
These particular subtests are especially suited 
for the present study for the following 
reasons : 


1. They are the only subtests of the scale which 
do not have time limits. On this point Guilford [3] 
states: 

Reliability indices of the split-half variety are 
most meaningful when derived from the kind of 
test or examination where every individual is al- 
lowed sufficient time to attempt every item. When 
reliability coefficients are determined for tests with 
a strict time limit, they should always be interpreted 
with caution. 

2. Three of the subtests — Comprehension, Sim- 
ilarities, and Vocabulary — require that the sub- 
ject’s answers be written out. This allows the an- 
swers to be rechecked and evaluated accordingly. 

3. With the exception of Similarities, these sub- 
tests are included in Wechsler’s “Hold” group. This 
is a group of four subtests which decline least with 
age [7]. 


For each of the subtests reliability coefficients
were found by correlating scores on odd
and even items, and then estimating the relia- 
bility of the entire subtest by the Spearman- 
Brown formula. The correlations between the 
split-half scores and the estimated reliability 
coefficients are presented in Table 1. 
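
As an illustration of the step-up from half-test to full-test reliability, the Spearman-Brown formula for a test of doubled length is 2r / (1 + r). The sketch below applies it to the half-test correlations of Table 1; the function name is hypothetical, and small discrepancies from the tabled values are to be expected because the half-test correlations are given to two decimals only.

```python
def spearman_brown(r_half, factor=2.0):
    """Spearman-Brown estimate of reliability for a test lengthened by `factor`."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)

# Half-test (odd-even) correlations as reported in Table 1.
half_test_r = {
    "Information": 0.66,
    "Comprehension": 0.39,
    "Similarities": 0.62,
    "Vocabulary": 0.89,
}

for name, r in half_test_r.items():
    print(f"{name:14s} {spearman_brown(r):.3f}")
```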


TABLE 1

HALF-TEST CORRELATIONS AND ESTIMATED RELIABILITY
COEFFICIENTS OF BELLEVUE SCALE SUBTESTS

                 Half-test      Estimated      Limits of Accuracy
Subtest          Correlation    Reliability    at the .01 Level
Information      .66            .79            .68 - .87
Comprehension    .39            .56            .37 - .70
Similarities     .62            .77            .65 - .85
Vocabulary       .89            .94            .91 - .96

Inspection of Table 1 indicates that Vocabu-
lary alone has a reliability coefficient above 
.90, high enough to justify its use as a diag- 
nostic tool. Comprehension is particularly un- 
reliable, while Similarities and Information 
occupy an intermediate position. The limits of 
accuracy of the reliability coefficients are pre- 
sented in the last column of Table 1. These 
limits give the range, at the .01 level of con- 
fidence, within which another investigator 
testing a comparable sample of neurotic pa- 
tients could expect his results to fall. Thus, 
the reliability coefficient of the Comprehension 
subtest could be as low as .37 or as high as .70, 
due to chance factors of sampling. The In- 
formation and Similarities subtest reliability 
coefficients could vary as much as .20. Vocabu- 
lary alone can be expected to yield results 
which are highly consistent from sample to 
sample. 

The evaluation and interpretation of a re- 
liability coefficient should take into considera- 
tion the purpose for which the test is to be 
used. The scores obtained on these particular 
subtests are frequently utilized as indicators 
of mental dysfunction if they deviate from the 
subtest mean by two or more weighted points 
[7]. Such being the case, the reliability co- 
efficients should be extremely high, that is, the 
scores obtained by any individual should vary 
only slightly from his hypothetical true scores. 
To determine how much variation could be 
expected by chance if these subjects were re- 
tested, the standard errors of the obtained 
scores were calculated. Table 2 gives the 
standard deviations of the distributions of sub- 
test scores and the standard errors of the sub- 
test scores. 

TABLE 2

STANDARD DEVIATIONS OF DISTRIBUTIONS AND STANDARD
ERRORS OF SCORES ON FOUR SUBTESTS

                           SE of      Range of variation of
Subtest          SD        Scores     score at the .01 level
Information      2.44      1.12       5.78
Comprehension    2.56      1.69       8.72
Similarities     3.00      1.44       7.43
Vocabulary       2.78      0.67       3.46


The standard error denotes the amount of
fluctuation that can be expected in test scores
if the same subject were retested under simi-
lar conditions. Because there are errors of
measurement which affect test scores, the true
score of any person can never be known.
Rather, the range within which the true score
probably lies is determined. At the .01 level,
this range is limited by 2.58 standard errors
on either side of the obtained score. The range
of fluctuation that can be expected in the
scores of the four subtests under investigation
is given in the third column of Table 2. From
the size of these figures, it can be seen that
errors of measurement should play a large
part in the interpretation of subtest deviations
of the Bellevue Scale.

As an example, on the Comprehension sub- 
test there is a range of 8.72 weighted score 
points within which a subject’s true score could 
fluctuate due to errors of measurement. Thus,
a person attaining a Comprehension score of 
ten could conceivably vary on retest from a 
score of six to fourteen, due to imperfections 
of the measuring instrument. 
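
The figures of Table 2 can be approximated with the usual formula for the standard error of an obtained score, SD multiplied by the square root of (1 - reliability), and a range of 2.58 standard errors on either side. The formula is assumed here rather than quoted from the paper, which does not state it; the function names are hypothetical, and the inputs are the SDs of Table 2 and the reliabilities of Table 1.

```python
import math

Z_01 = 2.58  # normal deviate for the .01 level, as used in the text

def standard_error_of_score(sd, reliability):
    """Standard error of an obtained score (assumed formula)."""
    return sd * math.sqrt(1.0 - reliability)

def fluctuation_range(se, z=Z_01):
    """Total width of the band z standard errors either side of a score."""
    return 2.0 * z * se

# (SD from Table 2, reliability from Table 1)
subtests = {
    "Information": (2.44, 0.79),
    "Comprehension": (2.56, 0.56),
    "Similarities": (3.00, 0.77),
    "Vocabulary": (2.78, 0.94),
}

for name, (sd, r_tt) in subtests.items():
    se = standard_error_of_score(sd, r_tt)
    print(f"{name:14s} SE = {se:.2f}  range = {fluctuation_range(se):.2f}")
```

The computed values differ from the tabled ones in the second decimal at most, presumably because the original work carried unrounded reliabilities.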

The minimum differences, in terms of
weighted score points², that would be neces-
sary between pairs of subtests for statistical
significance were calculated. These differences,
expressed in terms of SD units, as well as in
terms of weighted score points, are presented
in Table 3. The standard deviation of each
subtest of the Bellevue Scale is equivalent to
three weighted score points [7, p. 219]. Thus,
from Table 3, it can be seen that the obtained
scores on the Similarities and Comprehension
subtests would have to differ by at least six
weighted points before it could be said that a
significant difference exists between the func-
tions measured by the two. Deviations from
the Vocabulary level, which is frequently
taken as an estimate of optimal mental func-
tioning, would need to be at least four weight-
ed points. Since the subtests employed in this
investigation are generally considered to be the
more stable of the eleven subtests of the Full
Scale, it is reasonable to expect that even
greater differences exist among the other, less
reliable subtests.


TABLE 3
MINIMUM DIFFERENCES BETWEEN PAIRS OF SUBTEST
SCORES NECESSARY FOR SIGNIFICANCE
AT THE .01 LEVEL

                                 Difference      Difference
                                 in Standard     in “Weighted
Subtest Pair                     Scores          Score” Points
Vocabulary - Information         1.34 SD         4.02
Vocabulary - Similarities        1.39 SD         4.17
Vocabulary - Comprehension       1.82 SD         5.46
Information - Comprehension      2.08 SD         6.24
Information - Similarities       1.70 SD         5.10
Similarities - Comprehension     2.11 SD         6.33


²The units in which Wechsler-Bellevue scores are
expressed in the Wechsler manual [7].

These results are in marked disagreement 
with Wechsler’s claim that deviations of two 
or three weighted score points among subtest 
scores constitute significant signs of mental 
disorder. The variability that is manifest on 
any Bellevue record can as easily and as justi- 
fiably be attributed to errors of measurement 
as to variability caused by personality disturb- 
ances. It would appear that the value of diag- 
nostic signs based on deviations of the sub- 
tests from the individual mean has been great- 
ly overestimated. 
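
The minimum differences of Table 3 appear to be consistent with the standard error of a difference between two standard scores, the square root of (2 - r₁ - r₂), multiplied by 2.58 and, for weighted points, by the subtest SD of 3. That formula is an assumption inferred from the tabled figures, not one stated in the paper, and the names below are hypothetical.

```python
import math
from itertools import combinations

Z_01 = 2.58        # normal deviate for the .01 level
WEIGHTED_SD = 3.0  # one SD equals three weighted score points

# Estimated reliabilities from Table 1.
reliability = {
    "Vocabulary": 0.94,
    "Information": 0.79,
    "Similarities": 0.77,
    "Comprehension": 0.56,
}

def minimum_difference_sd(r1, r2, z=Z_01):
    """Smallest significant difference between two subtests, in SD units
    (assumed formula: z times the SE of a difference of standard scores)."""
    return z * math.sqrt(2.0 - r1 - r2)

for a, b in combinations(reliability, 2):
    d = minimum_difference_sd(reliability[a], reliability[b])
    print(f"{a} - {b}: {d:.2f} SD, {d * WEIGHTED_SD:.2f} weighted points")
```

On this reading no pair of the four subtests reaches significance with a difference of fewer than four weighted points, which is the basis of the disagreement with the two-or-three-point rule.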


RAPAPORT’S SIGNS OF THE NEUROSES 

Rapaport [6], on the basis of his investiga- 
tion of the diagnostic potentialities of the 
Bellevue Scale, determined scatter patterns 
and diagnostic signs for several clinical groups. 
Several of his signs refer specifically to the 
neuroses. If these signs are diagnostic of the 
neuroses, as claimed, they should be present in 
a very large percentage of the cases under 
consideration because all of the subjects of 
this study were discharged with diagnoses of 
psychoneuroses. This presupposes comparabil- 
ity of Rapaport’s subjects and the subjects of 
the present study only insofar as diagnosis is 
concerned. The conclusions drawn by Rapaport 
are not qualified by the age, sex, or color of the 
subjects. In order to keep the diagnoses as 
similar as possible, and to avoid the entangle- 
ments of reclassification, only those syndromes 
which were clear-cut, by virtue of discharge 


diagnosis, were utilized in this area of the 
study. 

After tabulation of the diagnoses, as taken 
from the hospital files, it was found that a 
sufficiently large number of cases could be 
mustered to permit the validation of two 
classes of signs: those pertaining to the neu- 
roses in general, and secondly, those signs of 
the anxiety neuroses. 

The signs described below were tallied as 
present in any record only if the exact criteria, 
as stated in Rapaport’s text, were satisfied. 
Otherwise, they were counted as absent. 


Signs Pertaining to the General Clinical Classification
of Neurosis.

1. Normals and Neurotics with the excep- 
tion of the Obsessive-Compulsives have rela- 
tively few cases of scatter of Comprehension 
below the Modified Verbal Mean [6, p. 128].

Since all of the subjects of this study were 
diagnosed as psychoneurotics and none of the 
cases was diagnosed as primarily Obsessive- 
Compulsive, the number of cases upon which 
validation of this sign was based was 122. By 
“scatter of Comprehension below the Modified 
Verbal Mean” is meant drops in this score of 
three or more weighted score points. 

Of the 122 cases, 118, or 96.7 per cent,
agree with this criterion, that is, only 3.3 per 
cent have drops in Comprehension score three 
or more points below the Modified Verbal 
Mean. In fact, in 84 or 69 per cent of the 
cases, the Comprehension subtest score is as 
high or higher than the Modified Verbal 
Mean. For the cases studied, agreement with 
Rapaport’s conclusion is indicated. 

2. Weighted scores of seven or less (on 
Block Designs) are extremely rare in Normals 
or Non-Depressed Neurotics: this attests to 
the great stability of Block Designs when de- 
pressive trends or psychotic confusion are ab- 
sent [6, p. 288]. 

In the tabulation of this sign four cases with 
depression mentioned in the discharge diagnos- 
is were eliminated, leaving a group of 118 
cases. Of these, only 22 or 18.6 per cent have 
Block Design scores of seven or less. This 
would appear to be an adequate validation of 
the conclusion reached by Rapaport. However, 
lest the figures be misleading, it must be mentioned
that every one of these 22 cases having
extremely low scores on Block Designs is 
below the mean intellectual level of the group. 
The mean IQ of these cases is 89.9 as com- 
pared to 107.7 for the entire group. These 
22 cases form 53 per cent of the total cases 
with IQ below 105. Scores of seven or 
less on Block Design are thus not extrem- 
ely low in themselves, but are low with 
low IQ. It is felt that such a general con- 
clusion as Rapaport’s, without reference to 
some criterion such as the Vocabulary level or 
the Mean Verbal Level, means nothing. This
sign is apparently more directly related to 
lower intellectual ability than it is to the neu- 
roses. 

Although these two signs of the neuroses 
are supported by the results of this study, it 
must be noted that Rapaport does not believe 
they differentiate adequately between Neu- 
rotics and Normals. Because they may also be 
rare in the general population, their diagnostic 
value is accordingly considerably reduced. 


Rapaport’s Diagnostic Signs of Anxiety 


In the Anxiety group were placed 25 cases
diagnosed as “Psychoneurosis, anxiety reac-
tion,” 26 cases of “Psychoneurosis, anxiety
state,” and one case of “Psychoneurosis, re-
mittent anxiety,” making a total of 52 cases
on which the validation of the following two 
signs was based. 

1. A Digit Span score much below the 
Vocabulary level and/or the Mean Verbal 
level is mainly indicative of the presence of 
anxiety [6, p. 193]. 

A drop of Digit Span of four or more 
points is considered to be significant. In the 
Anxiety Group only 12 or 23 per cent of the 
cases exhibit Rapaport’s sign. It is quite sig- 
nificant that for a greater number of the cases, 
38 per cent, the Digit Span score, instead of 
being lower than Vocabulary, is as high or 
higher than Vocabulary. The incidence of 
this sign for the total Neurotic Group, N= 
122, and the entire subject group minus the 
Anxiety Group, N=70, were also determined. 
The percentages were found to be approxi- 
mately the same as those above, that is, the 
sign was present in 20 per cent of the total 
group minus the Anxiety Group, and in 21.5
per cent of the total Neurotic Group. 

The figures derived from this study differ 
sharply from those reported by Rapaport for 
scatter of Digit Span below Vocabulary. These 
findings do not support Rapaport’s conclusion 
that when anxiety is present, Digit Span drops 
significantly. 
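
The tallying procedure for this sign can be sketched as follows. The records shown are invented for illustration and are not data from the study; the field names and function names are likewise hypothetical. Only the criterion itself, a Digit Span score four or more weighted points below Vocabulary, is taken from the text.

```python
def digit_span_sign(record, drop=4):
    """True when Digit Span lies `drop` or more weighted points below Vocabulary."""
    return record["vocabulary"] - record["digit_span"] >= drop

def incidence(records, sign):
    """Number and percentage of records in which the sign is present."""
    hits = sum(1 for rec in records if sign(rec))
    return hits, 100.0 * hits / len(records)

# Hypothetical weighted scores for four subjects (illustration only).
demo_records = [
    {"vocabulary": 12, "digit_span": 7},
    {"vocabulary": 10, "digit_span": 10},
    {"vocabulary": 11, "digit_span": 8},
    {"vocabulary": 9, "digit_span": 11},
]

hits, per_cent = incidence(demo_records, digit_span_sign)
print(hits, per_cent)  # 1 25.0
```

Applied separately to the Anxiety Group and to the remaining cases, the same tally gives the two incidences whose similarity is the point at issue.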

2. Impaired efficiency on the Object As- 
sembly subtest may be a reflection of depressive 
or anxiety trends or both. If the impaired eff- 
ciency is essentially the consequence of anxiety, 
the score will not only be significantly below 
the Vocabulary level, but also below the level 
of the other Performance subtest scores which 
do not appear especially vulnerable to en- 
croachment by anxiety [6, p. 270]. 

By a significant lowering Rapaport once 
again means a drop of four weighted score 
points below the Vocabulary level. For signifi- 
cance, a deviation of two points is required be- 
tween the Object Assembly score and the Per- 
formance Mean, calculated from the other 
four Performance subtests—Picture Arrange- 
ment, Picture Completion, Block Designs, and 
Digit Symbols. 

The incidence of these signs was, once again, 
determined for three groups: the Anxiety 
Group, N=52; the total subject group minus 
the Anxiety Group, N=70; and for the total 
Neurotic Group, N=122. 

This sign of anxiety, as measured by the de- 
viation of the Object Assembly subtest from 
the Vocabulary level, is absent in 89 per cent 
of the cases of the Anxiety Group, in 87 per 
cent of the total subject group minus the 
Anxiety Group, and in 88 per cent of the 
total Neurotic Group. When the scatter of 
Object Assembly is measured from the Modi- 
fied Performance Mean the results are in even 
greater disagreement with Rapaport’s conclu- 
sion, the sign being absent in 96 per cent of the 
Anxiety Group, in 87 per cent of the total 
subject group minus the Anxiety Group, and 
in 91 per cent of the total Neurotic Group. 

The present study does not support Rapa- 
port’s claims. 


SUMMARY AND CONCLUSIONS 


It was the purpose of this paper to present 
the data resulting from an investigation of:

1. the split-half reliability of four of the 
subtests of the Wechsler-Bellevue Intelligence 
Scale, Information, Comprehension, Similari- 
ties, and Vocabulary; 

2. the validity of four of Rapaport’s signs 
of the neuroses. 

Using the Spearman-Brown formula, the re- 
liability coefficients of the four subtests were 
calculated to be: Information .79, Compre- 
hension .56, Similarities .77, and Vocabulary 
.94. To evaluate the magnitude of the reliabili- 
ty coefficients the standard errors of measure- 
ment were calculated. With the exception of 
the Vocabulary subtest, the estimated range of 
fluctuation of each individual’s score which 
could be attributed to chance errors was found 
to exceed the amount of deviation necessary 
for significance by Wechsler’s standards. The 
differences in weighted subtest scores neces- 
sary for significant deviations between pairs of 
subtests were also determined. The smallest 
difference was found to be four weighted 
score points—which is in marked disagreement 
with Wechsler’s standards. The amount of 
fluctuation in subtest scores which can be ex- 
pected to occur because of errors of measure- 
ment is so high that the usefulness of diag- 
nostic signs based on variability in subtest 
scores is highly questionable, at least for the 
group studied. 
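
The argument above rests on two standard psychometric quantities. The following is a minimal sketch of both, assuming the textbook forms of the Spearman-Brown correction and of the standard error of measurement, and assuming a weighted-score standard deviation of 3. The standard deviation, the function names, and the difference formula are illustrative assumptions; the half-test correlations and the exact computations of the study are not given in the text.

```python
import math


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to an estimated full-test reliability."""
    return 2.0 * r_half / (1.0 + r_half)


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Textbook SEM: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def standard_error_of_difference(sd: float, r_a: float, r_b: float) -> float:
    """Standard error of a difference between two scores that share the same sd."""
    return sd * math.sqrt(2.0 - r_a - r_b)


# Assumption: weighted subtest scores have a standard deviation of about 3.
ASSUMED_SD = 3.0

# Reliability coefficients as reported in the summary above.
reported = {
    "Information": 0.79,
    "Comprehension": 0.56,
    "Similarities": 0.77,
    "Vocabulary": 0.94,
}

for name, r in reported.items():
    sem = standard_error_of_measurement(ASSUMED_SD, r)
    print(f"{name:<14} reliability {r:.2f}  SEM {sem:.2f} weighted-score points")
```

Under these assumptions the less reliable a subtest, the larger its standard error, which is the sense in which a low coefficient such as .56 undermines signs that depend on small deviations between subtest scores.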

An evaluation of four of Rapaport’s signs of 
the neuroses found them lacking in this group 
of neurotics. In view of the conclusions reached 
in the study of subtest reliability, this should 
not seem surprising. Errors of measurement 
are so great that no definite pattern can ap- 
parently be predicted for clinical groups. The 
occurrences of similarities in variability signs or 
patterns must be attributed to chance factors. 
Evidently Rapaport’s conclusions apply to his 
own group of neurotics and cannot be trans- 
ferred in toto to a different group. 

Since most of the studies in the literature 
dealing with variability on the Bellevue Scale 
contrast clinical groups, such as comparing a 
group of neurotics with a group of schizophrenics, 
the result may well be an indication 
of composite group trends and not of individual 
response patterns. An example of this 
particular point is Rapaport’s “group scattergram” 
which represents the average Vocabulary 
scatter of all the individuals in the group 
on each of the subtests. The following quotation 
from his text illustrates the impossibility 
of attempting to fit an individual’s scatter 
pattern to that of a group in which individual 
differences have been cancelled out by the 
averaging of test scores or deviations: 

We attempted to find in our material individual 
scattergrams for each clinical group which would 
contain all the features we found characteristic 
of that group. No such individual scattergrams 
were found. . . . [6, pp. 299-300]. 

Many factors, such as errors of measurement, 
distractions, inefficiencies due to mental 
disturbance, individual differences even among 
the normal population, and individual differ- 
ences among the examiners, may all combine in 
such a way as to almost certainly preclude the 
fitting of an individual’s results into a group 
pattern. 

The study leads to a conclusion which is be- 
coming more and more apparent in clinical 
practice: the quantitative subtest scores merely 
give a quick estimate of the subject’s various 
ability levels, and it is only clinical acumen in 
the qualitative analysis of the performance and 
behavior of the subject on each of the subtests 
which leads to accurate diagnoses. 


Received May 23, 1949. 
AN ITEM ANALYSIS OF THE WECHSLER-BELLEVUE TESTS 


JOSEPH JASTAK 


DELAWARE STATE HOSPITAL 


THE items of a psychometric scale are 
usually arranged in a continuous series 
beginning with the easiest and ending with 
the most difficult item. The progression rough- 
ly corresponds to the percentage of successes 
obtained by the group on which the scale was 
standardized. This system of placement has 
been used by Wechsler in the Bellevue Scale 
[3]. 

Rapaport [2] made an item analysis of the 
Bellevue tests and found orders of difficulty 
somewhat different from those of the Wechs- 
ler record blank. In daily clinical use, marked 
deviations from the order established by 
Wechsler for some of the subscales of his bat- 
tery are common. 

There are several reasons why the items of 
a scale should be arranged in a relatively 
stable order, well-graded according to difficulty. 
One of them is economy and ease of 
testing. Another is to set reasonable limits of 
individual testing. 

The most compelling reason for accurate 
scale grading is to enable us to derive clinically 
meaningful measures of intratest scattering. 
It is also important that beginners in clinical 
psychology who are learning to use the Belle- 
vue Scale be made aware of the irregular suc- 
cession of the items of the subscales. Other- 
wise, their examinations may be incomplete 
and their results unfair to the subjects tested. 

The positional measure of an item in terms 
of per cent passing is neither constant nor ab- 
solute. It varies with a number of factors, 
constant or accidental, psychological or envi- 
ronmental, intellectual or nonintellectual. Mc- 
Nemar [1], studying the wide spread of indi- 
vidual Stanford-Binet performances, explains 
that item unreliability, lack of steepness of dif- 
ferentiation, varying rates of mental growth, 
sex differences, group and specific factors may 
be responsible for extreme response inconsistencies. 
Item difficulty is also related to the problem 
of personality disturbances which influence 
all test responses. 

The nature of the item may be such as to 
favor periodic variations in difficulty. 


For example, the information questions as to who 
the president of the United States now is and who 
his predecessor was would have to be standardized 
for each year of presidential term of office to be 
properly evaluated. The difficulty of the item con- 
cerning the population of the United States varies 
with the publication of census figures and estimates 
of population increases. George Washington’s birth- 
day is known to more people when the question is 
asked in February than it is at any other time of 
the year. The question is easier for those persons 
whose own birthdays fall in February than it is for 
those born in other months. 


Wechsler [3] reports that World War II has 
changed the difficulty of the vocabulary items “nitro- 
glycerine, espionage, and harakiri.” The last war 
has had a significant effect on the difficulty of a great 
many items of the Bellevue information scale. It is 
probable that this scale is easier for the majority of 
people now than it was during the period of stand- 
ardization. There is evidence that the difficulty of 
the following items of the information scale has 
been reduced: “rubber, London, Italy, Japan, plane, 
Brazil, Paris, Egypt, Vatican.” 


The manner of scoring and of counting 
successes may result in purely technical dis- 
placements in the order of difficulty. 


For example, the word “affliction” was correctly 
defined by 742 persons in our group of 1172 cases. 
But 704 of 742 earned half credit for their defini- 
tions. Only 488 persons of 1172 were able to define 
the word “guillotine.” Over 90 per cent of them 
earned full credit for their definitions. If partial 
credits are used in the determination of difficulty, 
the word “affliction” turns out to be much more 
difficult than the word “guillotine,” even though 
254 more persons responded correctly to the former 
than the latter word. The amount of credit earned 
is more a matter of the nature of the stimulus and 
the scoring criteria than it is a function of item 
difficulty. Some intellectually superior persons give 
inferior word definitions because of anxiety, delu- 
sions, emotional irrelevancies and projections. 


The ordinal consistency of test items is in- 
creased when partial and qualitative scores are 
disregarded in the final analysis. For this 
reason, we have used the all-or-none system in 
reporting the percentages of successes for the 
Bellevue subscales. 

The item analysis was made on 1600 cases, 
all examined at the Delaware State Hospital 
and Mental Hygiene Clinic. The tests had 
been administered by twenty-three different 
examiners between the years 1940 and 1948 
inclusive. The records analyzed¹ were those of 
580 female applicants to nursing schools, 166 
male hospital attendants and employees, 134 
female hospital attendants and employees, 200 
female adult patients, 200 male adult patients, 
220 boys and 100 girls between the ages 11 
and 16. The adult patients included all psychiatric 
categories. The children had been referred 
to the clinic for examination and treatment 
because of behavior difficulties and delinquencies. 

The percentage of successes was obtained 
for each of the seven groups and for each item 
of the vocabulary, information, comprehension, 
similarities, picture completion, picture arrangement, 
block design, object assembly, 
arithmetic, and digit span tests. Wherever a 
test is scored on the basis of quality, the percentage 
for each possible score of that item was 
obtained. In distinguishing between right and 
wrong responses, Wechsler’s criteria and 
methods of scoring were adhered to throughout 
the entire scale. As was stated before, the final 
ranks of the items of each scale are based on 
the all-or-none principle of success. 

Unusual findings of the intraitem analysis 
will be mentioned in the text. 

The items of each subscale were ranked in 
order of difficulty for each of the seven clinical 
groups. The ranks were then correlated with 
each other, with Wechsler’s order, and with 
our final order by the method of rank-differences 
(rho) for each scale containing 10 or 
more items (vocabulary, information, comprehension, 
similarities, arithmetic, and picture 
completion). 

¹The writer wishes to acknowledge the assistance 
received from other psychologists of our clinic in 
transferring the item scores from the Record Blank 
to statistical cards. Dr. E. S. Vik tabulated 80 cases; 
Dr. V. V. Spaulding 105 cases; Mr. R. Robison 80 
cases; Mr. and Mrs. M. Whiteman 125 cases. 

RESULTS 

Though it was not our primary aim to 
study the factors associated with item difficulty 
or response variability, a comparison of the seven 
clinical groups was nevertheless made to 
determine the degree of consistency of item 
ranks in both normal and abnormal subjects. 
While the spread or range of successful responses 
and the total achievements on some 
TABLE 1 
MATRIX OF RANK ORDER CORRELATION COEFFICIENTS FOR THE 
ITEMS OF THE BELLEVUE PICTURE COMPLETION TEST 

                                        Attendants        Patients          Clinic
                    Wechsler  Nurses   Male   Female    Male   Female    Boys   Girls
                     Order    N=580   N=166   N=134    N=200   N=200    N=220   N=100
Our Final Order      .768     .954    .982    .957     .986    .962     .957    .925
Wechsler's Order      —       .725    .804    .664     .732   [?]78     .761    .614
Nurses                —        —      .904    .947     .924    .950     .854    .947
Male Attendants       —        —       —      .968     .961    .932     .961    .909
Female Attendants     —        —       —       —       .950    .968     .946    .943
Male Patients         —        —       —       —        —      .922     .922    .903
Female Patients       —        —       —       —        —       —       .893    .939
Clinic Boys           —        —       —       —        —       —        —      .886
subscales definitely differentiated between our 
normal (nurses, attendants) and abnormal 
(patients) groups, the rank-order correlations 
did not. The inferior scores obtained by pa- 
tients did not affect to any extent the ranks 
of the individual items. 


The matrix of rank-order correlation co- 
efficients for the picture completion subscale 
and our seven population groups is given in 
Table 1. It illustrates the fact that the corre- 
lations between our groups are uniformly and 
conspicuously higher than are the correlations 
between Wechsler’s order and any of our 
groups. 


Because of the great similarity of ranks in 
all our groups, it was decided not to list the 
number and percentage of successes for each 
group and subscale separately. An inspection 
of the correlation tables and of the actual per- 
centages revealed that an item analysis based 
on sex would give more informative results 
than one based on the various clinical categor- 
ies. 

All subscales except the digit-symbol test 
are listed in the order in which they appear on 
Wechsler’s record form. The items of each 
scale are arranged in the new order of diffi- 
culty determined by our analysis. Wechsler’s 
ordinal numbers are given in parentheses fol- 
lowing our numbers. The results are recorded 
in four columns: (1) the number and percent- 
age of successes for the total population of 
1172 cases²; (2) for 586 males; and (3) for 
586 females; (4) test of significance of the 
differences between males and females. Only 
those t-values are listed which were found to 
be statistically significant near or above the one 
per cent level. The letter F preceding the t- 
value indicates that the discrepancy is in favor 
of females, the letter M signifies that the dif- 
ference is in favor of males. Below each table, 
the rank-difference coefficients between the 
ranks for males and females are given. 
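
The t column can be illustrated with one item. The article does not state which formula was used, so the following is only a sketch, assuming the ordinary critical ratio for the difference between two independent proportions with an unpooled standard error. The function name is hypothetical; the counts are those printed for the "Hamlet" item in Table 2 (171 of 586 males, 283 of 586 females).

```python
import math


def proportion_difference_t(pass_a: int, n_a: int, pass_b: int, n_b: int) -> float:
    """Critical ratio for the difference between two independent proportions.

    Assumption: unpooled standard error, sqrt(p_a*q_a/n_a + p_b*q_b/n_b).
    """
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    standard_error = math.sqrt(p_a * (1.0 - p_a) / n_a + p_b * (1.0 - p_b) / n_b)
    return (p_a - p_b) / standard_error


# "Hamlet" item: females first, so a positive value favors females.
t_hamlet = proportion_difference_t(283, 586, 171, 586)
print(f"Hamlet, females minus males: t = {t_hamlet:.2f}")
```

Under this assumption the ratio comes out close to the tabled F 6.85, which suggests, without proving, that a formula of this kind lies behind the column.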


Information—Table 2. Five items of the information 
scale yield significant differences between 
the successes of males and females. The 
exact differences between males and females 
cannot be definitely established unless the factors 
of intelligence and personality functioning 
are kept constant. 


TABLE 2 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE INFORMATION 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. ( 2) Thermometer    1125  96.0     570  97.3      555  94.7    M 2.28
 2. ( 3) Rubber          975  83.2     502  85.7      473  80.7    M 2.29
 3. ( 1) President       947  80.8     486  82.9      461  78.7
 4. ( 5) Pints           913  77.9     447  76.3      466  79.5
 5. ( 4) London          867  74.0     422  72.0      445  75.9
 6. ( 6) Weeks           786  67.1     394  67.2      392  66.9
 7. (10) Plane           706  60.2     368  62.8      338  57.7
 8. (11) Brazil          696  59.4     339  57.8      357  60.9
 9. ( 9) Height          664  56.6     293  50.0      371  63.3    F 4.63
10. (13) Heart           654  55.8     326  55.6      328  56.0
11. (16) Washington      621  53.0     266  45.4      355  60.6    F 5.28
12. ( 8) Japan           576  49.1     294  50.2      282  48.1
13. ( 7) Italy           539  46.0     261  44.5      278  47.4
14. (14) Hamlet          454  38.7     171  29.2      283  48.3    F 6.85
15. (12) Paris           401  34.2     243  41.5      158  27.0    M 5.29
16. (19) H. Finn         387  33.0     190  32.4      197  33.6
17. (20) Vatican         353  30.1     159  27.1      194  33.1    F 2.26
18. (18) Egypt           322  27.6     174  29.7      148  25.2
19. (15) Population      201  17.2     141  24.1       60  10.2    M 6.44
20. (17) Pole            160  13.6      93  15.9       67  11.4    M 2.26
21. (21) Koran            96   8.2      50   8.5       46   7.8
22. (23) H. Corpus        78   6.6      46   7.8       32   5.5
23. (22) Faust            37   3.2      18   3.1       19   3.2
24. (25) Apocrypha        20   1.7      14   2.4        6   1.0
25. (24) Ethnology         9   0.8       7   1.2        2   0.3

Rho between male and female ranks .964. 
Rho between Wechsler’s order and our own .935. 

²Only 152 records out of 580 nurses examined were 
used in the determination of item difficulty. 


The third item of the scale (president) varies 
greatly in the number of successes depending on the 
year in which the tests were administered. During 
the final years of the Roosevelt administration the 
item dropped to 6th place in the scale. Its present 
position is an average of the results obtained then 
and the results obtained after President Roosevelt’s 
death when nearly everyone knew who preceded 
President Truman. While items calling for facts 
which are subject to periodic changes are clinically 
valuable, they are qualitatively different from the 
remaining items of this scale. The Bellevue information 
scale is far less homogeneous a test than might 
be supposed. 


Comprehension—Table 3. We agree with 
McNemar [1] that sex differences are due 
chiefly to the nature of the contents of the test 
items. It is apparent from Table 3 that the 
comprehension scale favors the females. The 
TABLE 3 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE COMPREHENSION 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. ( 3) Company        1066  91.0     537  91.6      529  90.3
 2. ( 1) Envelope       1032  88.0     516  88.0      516  88.0
 3. ( 2) Theatre         955  81.5     482  82.2      473  80.7
 4. ( 5) Shoes           921  78.6     456  77.8      465  79.4
 5. ( 4) Taxes           886  75.6     442  75.4      444  75.8
 6. ( 8) Laws            683  58.3     314  53.6      369  63.0    F 3.28
 7. ( 6) Land            675  57.6     326  55.6      349  59.6
 8. ( 7) Forest          665  56.7     349  59.6      316  53.9
 9. (10) Deaf            463  39.5     201  34.3      262  44.7    F 3.66
10. ( 9) Marriage        243  20.7     111  18.9      132  22.5

Rho between male and female ranks .952. 
Rho between Wechsler’s order and our own .842. 


only item which is easier for men than for 
women concerns finding one’s way out of a 
forest. It approaches significance. The com- 
prehension scale has a dearth of items at the 
upper levels of difficulty. 


Digit Span—Table 4. The digit span test 
is one of the most scalable of all psychometric 
abilities despite its inferior reliability. Meas- 
ures of intratest scatter are best obtained on 


TABLE 4 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE DIGIT SPAN 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. (1) Forward 3       1172  100      586  100       586  100
 2. (2) Forward 4       1160  99.0     580  99.0      580  99.0
 3. (3) Forward 5       1031  88.0     510  87.0      521  88.9
 4. (4) Forward 6        659  56.2     322  54.9      337  57.5
 5. (5) Forward 7        309  26.4     137  23.4      172  29.4    F 2.33
 6. (6) Forward 8        107   9.1      48   8.2       59  10.1
 7. (7) Forward 9         38   [remaining entries illegible in source]
 1. (1) Backward 2      1149  98.0     569  97.1      580  99.0
 2. (2) Backward 3      1075  91.7     533  91.0      542  92.5
 3. (3) Backward 4       783  66.8     369  63.0      414  70.6    F 2.41
 4. (4) Backward 5       387  33.0     163  27.8      224  38.2    F 3.81
 5. (5) Backward 6       133  11.3      47   8.0       86  14.7    F 3.64
 6. (6) Backward 7        40   3.4      11   1.9       29   4.9    F 2.86
 7. (7) Backward 8         6   0.5       3   0.5        3   0.5





this scale by considering the forward and 
backward series in one continuous scale accord- 
ing to the order of difficulty of the respective 
items. The diagnostic value of the discrepancies 
between the forward and backward 
scales, mentioned by Rapaport [2], could thus 
be objectively studied. 


TABLE 5 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE ARITHMETIC 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. ( 1) 644 [sic]      1142  97.4     574  98.0      568  96.9
 2. ( 2) 10—4           1098  93.7     549  93.7      549  93.7
 3. ( 4) oranges         772  65.9     388  66.2      384  65.5
 4. ( 3) 26—8            761  64.9     381  65.0      380  64.8
 5. ( 5) miles           746  63.6     380  64.8      366  62.4
 6. ( 7) sugar           699  59.6     356  60.8      343  58.5
 7. ( 6) 50—14           643  54.9     325  55.5      318  54.3
 8. ( 8) car             310  26.4     172  29.4      138  23.5    M 2.30
 9. ( 9) train           135  11.5      82  14.0       53   9.0    M 2.69
10. (10) men              93   7.9      56   9.6       37   6.3    M 2.09

Rho between male and female ranks 1.00. 
Rho between Wechsler’s order and our own .976. 


Arithmetic—Table 5. Items 3 to 7 of the 
arithmetic scale lack the desired degree of 
steepness of gradation. There are significant 
gaps between items 7 and 8, 8 and 9. Men tend 
to do slightly better in arithmetic than do 
women. The discrepancies between the sexes 
increase with the difficulty of the items. It is 
the writer’s experience that straight-forward 
computations such as “subtract 7 from 23” 
are clinically more valuable than are items of 
the problem solving and reasoning variety. 


Similarities—Table 6. The ranks of the first 
four items of the similarities subscale are exact- 
ly reversed in comparison with Wechsler’s or- 
der. The first five items of this scale show 
relatively small differences in difficulty. Fe- 
males do consistently better on this test than 
do males except on the 9th item: wood-alcohol. 
Six of the items yield statistically significant 
differences in favor of females. 


Vocabulary—Table 7. Despite the high 
rank-difference coefficient between Wechsler’s 
TABLE 6 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE SIMILARITIES 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. ( 4) Wagon-Bicycle  1070  91.3     529  90.3      541  92.3
 2. ( 3) Dog-Lion       1048  89.4     521  88.9      527  89.9
 3. ( 2) Coat-Dress     1037  88.5     510  87.0      527  89.9
 4. ( 1) Orange-Banana  1033  88.1     504  86.0      529  90.3
 5. ( 5) Paper-Radio    1014  86.5     497  84.8      517  88.2
 6. ( 6) Air-Water       572  48.8     244  41.6      328  56.0    F 4.98
 7. ( 8) Eye-Ear         515  43.9     205  35.0      310  52.9    F 6.28
 8. ( 9) Egg-Seed        392  33.4     156  26.6      236  40.3    F 5.02
 9. ( 7) Wood-Alcohol    314  26.8     160  27.3      154  26.3
10. (10) Poem-Statue     297  25.3     117  20.0      180  30.7    F 4.25
11. (11) Praise-Punish   140  11.9      44   7.5       96  16.4    F 4.73
12. (12) Fly-Tree        140  11.9      51   8.7       89  15.2    F 3.65

Rho between male and female ranks .951. 
Rho between Wechsler’s order and our own .909. 


order and our own, the vocabulary test appears, 
in clinical use, to be one of the least adequately 
ordered scales. The greatest discrepancies in 
the successes of men and women occur between 
the 14th and the 33rd words. Ten items show 
statistically significant differences in favor of 
women and three items in favor of men. Wo- 
men obtain higher vocabulary scores than do 
men because of the greater affinity of its con- 
tents to the cultural interests of women. If the 
number of concepts in line with male interests 
were increased, the differences between vocabulary 
scores of men and women might be eliminated 
even though separate items would still 
distinguish between male and female patterns 
of interest. 

TABLE 7 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE VOCABULARY 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. ( 1) Apple          1170  99.8     586  100       584  99.6
 2. ( 7) Cushion        1163  99.2     583  99.5      580  99.0
 3. ( 2) Donkey         1162  99.1     583  99.5      579  98.8
 4. ( 6) Fur            1160  99.0     582  99.3      578  98.6
 5. (11) Nail           1152  98.3     581  99.1      571  97.4
 6. ( 5) Nuisance       1143  97.5     573  97.8      570  97.3
 7. (10) Bacon          1140  97.3     564  96.2      576  98.3
 8. ( 4) Diamond        1133  96.7     562  95.9      571  97.4
 9. ( 3) Join           1132  96.6     568  96.9      564  96.2
10. ( 9) Gamble         1127  96.2     570  97.3      557  95.2
11. (12) Cedar          1107  94.4     547  93.3      560  95.6
12. (16) Brim            974  83.1     487  83.1      487  83.1
13. ( 8) Shilling        945  80.6     467  79.7      478  81.6
14. (20) Nitroglycerine  821  70.0     463  79.0      358  61.1    M 6.83
15. (22) Microscope      821  70.0     393  67.1      428  73.0
16. (14) Armory          794  67.7     399  68.1      395  67.4
17. (26) Affliction      742  63.3     331  56.5      411  70.1    F 4.87
18. (13) Tint            735  62.7     265  45.2      470  80.2    F 13.31
19. (15) Fable           690  58.9     297  50.7      393  67.1    F 5.80
20. (18) Plural          640  54.6     259  44.2      381  65.0    F 7.32
21. (21) Stanza          629  53.7     258  44.0      371  63.3    F 7.26
22. (24) Belfry          594  50.7     273  46.6      321  54.8    F 2.82
23. (17) Guillotine      488  41.6     237  40.4      251  42.8
24. (19) Seclude         468  39.9     187  31.9      281  48.0    F 6.71
25. (27) Pewter          442  37.7     204  34.8      238  40.6
26. (25) Recede          369  31.5     147  25.1      222  37.9    F 4.76
27. (23) Vesper          344  29.4     117  20.0      227  38.7    F 7.19
28. (31) Espionage       328  28.0     178  30.4      150  25.6
29. (34) Harakiri        326  27.8     221  37.7      105  17.9    M 7.76
30. (29) Catacomb        227  19.4     100  17.1      127  21.7
31. (33) Mantis          227  19.4     105  17.9      122  20.8
32. (28) Ballast         222  18.9     156  26.6       66  11.3    M 6.80
33. (30) Spangle         199  17.0      44   7.5      155  26.4    F 8.92
34. (32) Imminent        115   9.8      58   9.9       57   9.7
35. (35) Chattel          87   7.4      42   7.2       45   7.7
36. (36) Dilatory         67   5.7      25   4.3       42   7.2
37. (40) Aseptic         [entries illegible in source]
38. (41) Flout            24   2.0      17   2.9        7   1.2
39. (38) Proselyte        17   1.4      10   1.7        7   1.2
40. (37) Amanuensis       12   1.0       7   1.2        5   0.8
41. (42) Traduce           5   0.4       3   0.5        2   0.3
42. (39) Moiety            3   0.2       3   0.5        0   0.0

Rho between male and female ranks .975. 
Rho between Wechsler’s order and our own .951. 

Picture Arrangement—Table 8. The agreement 
between Wechsler’s order and our own 
is complete in this scale. However, the test 
lacks an adequate range of difficulty. The 
second and third items as well as the fourth 
and fifth items lack the desired steepness of 
difficulty. The “taxi” and the “fish” series receive 
a time bonus. Only 91 cases or 14.5 per 
cent of those who passed the “taxi” item earn- 
ed time credit. Of those who passed the “fish” 
item, 116 or 29.4 per cent earned additional 
time credits. Sixty-five per cent of those who 
TABLE 8 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE PICTURE ARRANGEMENT 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. (1) House           1102  94.0     553  94.4      549  93.7
 2. (2) Hold-up          947  80.8     478  81.6      469  80.0
 3. (3) Elevator         931  79.4     465  79.4      466  79.5
 4. (4) Flirt            636  54.3     282  48.1      354  60.4    F 4.26
 5. (5) Taxi             624  53.2     297  50.7      327  55.8    F 2.84
 6. (6) Fish             401  34.2     166  28.3      235  40.1    F 4.29

Rho between male and female ranks .944. 

passed the “taxi” item, earned a credit of only 
one point. Despite the time bonus added to the 
last two items and the qualitative scoring of 
the last three items, this scale tends to yield 
highly variable results. The qualitative scoring 
is inapplicable to a relatively large number of 
cases. The contents of the last three items are 
definitely more congenial to the thinking of 
women than men. 


Picture Completion—Table 9. Considerable 
discrepancies exist between our order and that 
of Wechsler in the picture completion test. 
Furthermore, striking sex differences are found 
in the successes on different items of this scale. 
Items involving human figures are done better 
by females than males. Items involving me- 
chanical objects and nonhuman elements are 
easier for men than they are for women. The 
first four items of this scale are exactly reversed 
in order of difficulty for males and females. 
The “hands of a clock” item is the easiest of 
the entire series in case of men. It comes fourth 
in case of women and ninth on Wechsler’s 
test blank. 
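
The rank-difference coefficient reported for this scale can be checked directly. The following is a minimal sketch, assuming the usual rank-difference formula, rho = 1 - 6Σd²/(n(n² - 1)), which holds when there are no tied ranks. The function name is illustrative; the second list holds the Wechsler positions printed in parentheses in Table 9, read down our order of difficulty.

```python
def rank_difference_rho(ranks_a, ranks_b):
    """Rank-difference correlation (rho) for two untied rankings of the same items."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same items")
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))


# Our order of difficulty for the 15 picture completion items: 1, 2, ..., 15.
our_order = list(range(1, 16))

# Wechsler's position for the same items, as given in parentheses in Table 9.
wechsler_order = [1, 2, 6, 9, 3, 8, 12, 4, 5, 10, 7, 13, 15, 11, 14]

rho = rank_difference_rho(our_order, wechsler_order)
print(f"rho = {rho:.3f}")
```

The squared rank differences sum to 130, and the resulting coefficient rounds to the .768 given under Table 9 for Wechsler’s order against our own.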

Block Designs — Table 10. The fourth 4- 
block item is more difficult according to our 
results than is the first 9-block item. The 
shortening of time allowed for the latter item 
would make that design a far more discrimi- 
nating item than it now is. It would also fill 
the gap between the fifth and sixth items. The 
creditless time periods of the Bellevue per- 


TABLE 9 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE PICTURE COMPLETION 
SCALE (N = 1172) 

                          Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
Item                      N     %       N     %        N     %       t
 1. ( 1) Nose           1047  89.3     507  86.5      540  92.2    F 3.18
 2. ( 2) Mustache       1046  89.2     518  88.4      528  90.1
 3. ( 6) Tail           1042  88.9     524  89.4      518  88.4
 4. ( 9) Hand           1015  86.6     537  91.6      478  81.6    M 5.08
 5. ( 3) Ear             938  80.0     467  79.7      471  80.4
 6. ( 8) Knob            914  78.0     478  81.6      436  74.4    M 2.99
 7. (12) Tie             822  70.1     437  74.6      385  65.7    M 3.35
 8. ( 4) Diamond         688  58.7     362  61.8      326  55.6    M 2.17
 9. ( 5) Leg             653  55.7     320  54.6      333  56.8
10. (10) Water           614  52.4     294  50.2      320  54.6
11. ( 7) Stack           555  47.4     327  55.8      228  38.9    M 5.89
12. (13) Thread          330  28.2     222  37.9      108  18.4    M 7.62
13. (15) Shadow          273  23.3     142  24.2      131  22.4
14. (11) Image           220  18.8     102  17.4      118  20.1
15. (14) Brow            140  11.9      50   8.5       90  15.4    F 3.67

Rho between male and female ranks .932. 
Rho between Wechsler’s order and our own .768. 


formance tests are generally too long. They 
accommodate the fumblers and laggards and 
thereby lose some of their differentiating power 
between normals and abnormals. 


TABLE 10 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE BLOCK DESIGN 
SCALE (N = 1172) 

                          Total          Males         Females
Item                    (N = 1172)     (N = 586)      (N = 586)
Order                     N     %       N     %        N     %       t
 1. (1)                 1097  93.6     546  93.2      551  94.0
 2. (2)                 1050  89.6     525  89.6      525  89.6
 3. (3)                  982  83.8     499  85.2      483  82.4
 4. (5)                  840  71.7     430  73.4      410  70.0
 5. (4)                  791  67.5     401  68.4      390  66.6
 6. (6)                  389  33.2     211  36.0      178  30.4    M 2.04
 7. (7)                  292  24.9     139  23.7      153  26.1

Rho between male and female ranks 1.00. 
Rho between Wechsler’s order and our own .964. 


The percentages of successful performances on the 
block design test are somewhat higher for men than 
for women in five out of seven items. Yet women 
tend to get higher total scores on this test than do 
men. The reason for this is that women are speedier 
than men and earn greater time credits. The following 
significant or near significant differences 
were found between males and females at the vari- 
ous credit levels. First item: credit 6 — F 2.84; cred- 
it 5 — F 4.89; credit 4 — M 3.04; credit 3 — M 
2.87. Second item: credit 5 — F 3.70; credit 3 — M 
2.93. Third item: credit 5 — F 2.17; credit 3 — M 
3.90. Fourth item: credit 4 — F 2.51; credit 3 — M 
3.31. Fifth item: credit 6 — F 3.46; credit 5 — F 
3.49; credit 3 — M 5.28. Sixth item: credit 3 — M 
2.49. Seventh item: credit 6 — F 2.73. 


TABLE 11 
ORDER OF DIFFICULTY OF THE ITEMS OF THE 
WECHSLER-BELLEVUE OBJECT ASSEMBLY 
SCALE (N = 1172) 

Order                     Total          Males         Females
                        (N = 1172)     (N = 586)      (N = 586)
New  Old  Item            N     %       N     %        N     %
 1.  (1)  Man           1172  100      586  100       586  100
 2.  (2)  Profile       1149  98.0     569  97.1      580  99.0
 3.  (3)  Hand          1101  93.9     553  94.4      548  93.5

Object Assembly — Table 11. About 76 per 
cent of the successes on the man test are at the 
6 credit level, another 16 per cent earn 5 cred- 
its. The remaining 8 per cent earn 1 to 4 
points for this item. Credits of 8, 7, 6, and 4 
points account for the successful performances 
of nearly 80 per cent of our cases on the pro- 
file and hand tests. No significant sex differen- 
ces were found at any of the credit levels of 
the three items of this scale. 

The percentage of low partial scores in this 
scale is significantly greater among hospital 
patients than among other groups. This scale 
is one of the least reliable and least useful of 
the Bellevue battery in spite of the fact that 
the type of function it measures is one of the 
most valuable in clinical examinations. 





SUMMARY AND CONCLUSIONS 
1. An analysis of 1600 Wechsler-Bellevue 
records was made to determine the order of 
difficulty of the items of each subscale. 

2. Numerous clinically significant differences 
were found between the order of items 
printed on the record blank of the Wechsler-Bellevue 
Scale and the actual percentages of 
successful responses from 1172 records used 
in the final analysis of difficulty. 

3. The greatest disagreement between 
Wechsler’s order and the results of our analy- 
sis occurred on the picture completion test. 
Other scales with significant discrepancies in 
item gradation were: comprehension, similari- 
ties, information, and vocabulary. 

4. When seven different diagnostic groups 
were compared, the item rank correlations be- 
tween the groups were consistently greater 
than were the correlations between Wechsler’s 
order and the order of any of the groups. 

5. In most scales striking differences be- 
tween the successes of males and females were 
found. Intratest scatter analysis must be based 
on the item ranks of that sex group to which 
the individual patient belongs. The derivation 
of masculinity-femininity ratios of some clini- 
cal value may be possible as a result of our an- 
alysis of sex differences. 

6. It is suggested that item analyses simi- 
lar to our own be made by other clinicians to 
determine the degree of agreement between 
item rank orders obtained by different examin- 
ers. If our results are confirmed by other re- 
searches, it might be expedient to change the 
order of test administration and to revise the 
record blank of the Wechsler-Bellevue Scale 
in conformance with clinical exigencies. Such 
a revision may result in increasing the overall 
usefulness of the scale in clinical practice.


AN ATTEMPT TO STUDY INTELLECTUAL
DETERIORATION BY PREMORBID
AND PSYCHOTIC TESTING¹


SHELDON R. RAPPAPORT 
ALTON STATE HOSPITAL, ILL. 
AND 


WILSE B. WEBB 


WASHINGTON UNIVERSITY, ST. LOUIS 


THE question of mental deterioration in schizophrenics remains a highly controversial one. It has not been proved unequivocally that their intellectual dysfunction is
either an apparent or a true inability. One 
school of thought, having such advocates as 
Shakow [8] and Arieti [1], contends that de- 
terioration in the schizophrenic is due to an or- 
ganic impairment and not to temporary or ex- 
trinsic factors. Opposing this viewpoint are 
such investigators as Kendig and Richmond 
[5], Layman [6], and Wittman [10], whose 
studies indicate that schizophrenic deteriora- 
tion is operational and not an essential loss. 
The apparent loss, they contend, is due to in- 
ability to sustain attention and effort, and to 
attitudinal factors of indifference and apathy;
the intellectual capacity remains intact.

It has been established by Rabin [7] and 
others that schizophrenics show a high varia- 
bility in intellectual performance on test and 
retest. The wide intratest scatter of schizo- 
phrenics is also well established [3]. Utilizing 
these scatter patterns and assuming that they 
represent differential resistance to deteriora- 
tion, Wechsler [9] proposes the utilization of 
his “hold” subtests as an index to the premor- 
bid intelligence of schizophrenics. Vocabulary 
has frequently been accepted as the best single
index to the prepsychotic level of efficiency [9], 
but it also has been deemed inadequate in indi- 
cating that level in severe mental disorders 
[8]. Other Bellevue subtests which have been shown to be significantly higher than the mean subtest score are Information, Comprehension, and Object Assembly [3, 4].

¹This study was made at Alton State Hospital by permission of Dr. Abraham Simon, Superintendent.

To date, the literature reports no attempt 
to measure quantitatively this loss in mental 
efficiency by prepsychotic and psychotic testing. 


STATEMENT OF THE PROBLEM 


In approaching the problem of mental de- 
terioration, the authors utilized a naive but 
defensible definition: “a decrement in test per- 
formance.” The task of this study was to 
measure the decrement in IQ points from pre- 
psychotic to psychotic testing. Specifically, this 
study was undertaken to determine: 

1. Whether the difference between pre- 
morbid and psychotic IQ’s would be significant. 

2. Whether selected “hold” subtests of the 
Wechsler-Bellevue would provide an accurate 
index to premorbid IQ. 

3. Whether attitudinal factors played a 
significant role in producing the apparent de- 
terioration. 


PROCEDURE 


Patients were selected for this study by two 
criteria: 

1. That they were unanimously diagnosed 
by the hospital staff as schizophrenic. 

2. That at some time during their junior 
or senior high school years they had had an 
IQ test. 

A statement of the test used, date of examination, and test results were then obtained from the respective Illinois public schools. During the six months in which this procedure has been followed, satisfactory data were obtained for only ten patients. The selection of cases by test availability resulted in only one male patient (Table 1).

Before receiving any type of shock therapy, each patient in the group was given the same form of the intelligence test taken while in school. These tests were given on an individual basis, the examiner being careful that the patient apparently understood the instructions and successfully completed the practice exercises.

Before scoring the test, the examiner rated the patient’s behavior on the Elgin Test Reaction Scale² (henceforth denoted ETRS) [10]. In this way his ratings were not biased by test results. The patient was also judged on the ETRS by another trained examiner who observed the patient for a short time but did not administer any test to that patient.


TABLE 1
TEST RESULTS AND CLINICAL DATA

                             Months in                Former  Present
Patient  Age    Diagnosis*   Hospital   Testᵇ         IQ      IQ       TRS₁ᶜ   HQᵈ     TRS₂
A        24     Sch, C.      33         O.H., A        94     59       13      112     78
B        27     Sch, C.      81         O.I., A        97     42       22       —       —
C        28     Sch, P.      16         I.G.I.S., 2    92     40       55       —       —
D        19     Sch, C.      28         O.I., A       114     51       41       59     52
E        25     Sch, M.       6         O.I., A       103     69       73       75     72
F        18     Sch, M.       7         O.I., A        98     87       66       85     64
G        15     Sch, A.R.     5         Cal-S.F.       82     83       76       79     74
H        28     Sch, C.       4         H-N, A         99     76       70       94     79
I        17     Sch, M.       2         H-N, A         84     55        7       81     75
J        28     Sch, H.      41         O.H., A       113     77       62       74     66
Mean     22.40                                         97.60  63.90    48.50    82.38  70.00

*Schizophrenic; Catatonic, Paranoid, Mixed, Hebephrenic, and Acute Reaction types.
ᵇIncludes Otis Higher and Intermediate Examinations, Form A; Illinois Gen. Intelligence Scale, Form 2; Henmon-Nelson High School Examination, Form A; and the Calif. Test of Mental Maturity, Advanced ’47 S-Form.
ᶜTRS₁ is the Elgin Test Reaction Scale given with the group test while TRS₂ was given with the Bellevue.
ᵈHQ is the “Hold” quotient: the sum of Vocabulary, Information and Comprehension weighted scores times 5/3 to serve as an equivalent to the Verbal IQ.


At a later date, but again before having any 
shock treatment, the patient was also given the 
Vocabulary, Information, and Comprehension 
subtests of the Wechsler-Bellevue, if the en- 
tire examination had not already been given to 
him. He was again rated on the ETRS at that 
time. 


RESULTS 


Table 1 shows the pertinent data on each of the ten patients. It will be noticed that patients B and C did not receive the Bellevue subtests. This is because they were inaccessible for testing due to extremely negativistic or aggressive behavior.

²This is an observational scale evaluating the areas of Motivation, Volubility and Relevancy of Speech, Emotional Reaction, Self-Confidence, Decisiveness, Auto-Criticism, Attention, Interest, Willingness, Social Confidence, and Effort. A score of 70 or above indicates a degree of cooperation such that the test may be considered reliable. A score of 60 to 70 indicates a questionable degree of reliability.

³The “hold” quotient, an extrapolated Verbal IQ obtained by taking 5/3 of the sum of the Vocabulary, Information, and Comprehension weighted scores.

The mean difference between former and present IQ’s was 33.7. The mean difference between the HQ³ and the former IQ was 16. The statistic t for the difference between the former and present IQ’s was significant beyond the 1 per cent level of confidence, whereas t for the difference between former IQ and HQ was only significant between the 5 and 10 per cent levels of confidence. However, t for the difference between the present IQ and the HQ was not significant even at the 10 per cent level of confidence.
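
The three comparisons just described can be checked against the scores listed in Table 1. The sketch below assumes a paired (correlated-samples) t test, which the text does not state explicitly; the function name `paired_t` and the list layout are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

# Scores transcribed from Table 1 (patients A-J). None marks the two
# patients (B and C) who did not receive the Bellevue subtests.
former = [94, 97, 92, 114, 103, 98, 82, 99, 84, 113]
present = [59, 42, 40, 51, 69, 87, 83, 76, 55, 77]
hq = [112, None, None, 59, 75, 85, 79, 94, 81, 74]


def paired_t(x, y):
    """Return (mean difference, t, degrees of freedom) for paired scores,
    skipping any pair in which either value is missing."""
    diffs = [a - b for a, b in zip(x, y) if a is not None and b is not None]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # stdev uses n - 1
    return mean(diffs), t, n - 1


for label, x, y in [("former - present", former, present),
                    ("former - HQ", former, hq),
                    ("HQ - present", hq, present)]:
    d, t, df = paired_t(x, y)
    print(f"{label}: mean diff = {d:.2f}, t = {t:.2f}, df = {df}")
```

Under this assumption the mean differences come out at 33.7 and 16.0, as in the text, and the t values (roughly 5.4, 1.9 and 1.9) fall in the same significance bands the authors report when compared with a two-tailed t table.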

The statistic rho (used because of the small 
sample) showed no correlation between pres- 
ent and former IQ’s (0.08), whereas a signi- 
ficant negative correlation was shown between 
the former IQ and the HQ (—0.60). When 
only the Vocabulary subtest instead of the HQ 
was used as equivalent to the Verbal IQ, a less 
significant but still negative correlation was 
shown (—0.33). When the Information and 
Comprehension subtests each were used to re- 
place the HQ, the negative correlation ranged 
from that of the Vocabulary quotient to that 
of the HQ. 

Rho for the present IQ and the ETRS 
given at that time indicated a substantial re- 
lationship (0.64). The relationship for HQ 
and the ETRS given with it was still higher 
(0.76). 
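
The rank correlations can likewise be recomputed from Table 1. The following is a minimal sketch of Spearman's rho, computed as a Pearson correlation on ranks; the helper names `ranks` and `spearman_rho` are illustrative and not taken from the study.

```python
from math import sqrt

# Scores transcribed from Table 1 (patients A-J); None = not tested.
former = [94, 97, 92, 114, 103, 98, 82, 99, 84, 113]
present = [59, 42, 40, 51, 69, 87, 83, 76, 55, 77]
hq = [112, None, None, 59, 75, 85, 79, 94, 81, 74]
trs1 = [13, 22, 55, 41, 73, 66, 76, 70, 7, 62]


def ranks(values):
    """Ranks with 1 = smallest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation over the pairs with no missing value."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    rx = ranks([a for a, _ in pairs])
    ry = ranks([b for _, b in pairs])
    n = len(pairs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


print(round(spearman_rho(former, present), 2))  # former vs. present IQ
print(round(spearman_rho(former, hq), 2))       # former IQ vs. HQ
print(round(spearman_rho(present, trs1), 2))    # present IQ vs. first ETRS
```

With the Table 1 values this gives roughly 0.08, −0.60 and 0.64, in line with the coefficients quoted in the text.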


SUMMARY AND DISCUSSION 

Patients having an unequivocal diagnosis of 
schizophrenia and having had an IQ test dur- 
ing their school career were used in this study. 
The number obtained during the past six 
months was 10. These patients were given the 
same form of the IQ test that they had had in 
school and were also rated in behavior on the 
Elgin Test Reaction Scale. They were later
rated again on this scale and given the Vocab- 
ulary, Information, and Comprehension sub- 
tests of the Wechsler-Bellevue. The extra- 
polated quotient derived from the weighted 
scores of these subtests was designated the 
Hold Quotient. 

Even in this small sample, it is indicated that the mean loss in IQ points between premorbid and psychotic testings was significant.
This loss seems to have been due to such atti- 
tudinal factors as inability to sustain attention, 
emotional disturbance, lack of conation, etc. 
This is supported by the high degree of rela- 
tionship between IQ and score on the Elgin 
Test Reaction Scale, which measures such atti- 
tudinal disturbances. It is further confirmed 
by the fact that when the HQ was strikingly 
higher than the Present IQ so was the patient 
rated much higher on the ETRS at the time of 


taking the Wechsler subtests than at the time 
of the paper and pencil test (viz. patients A 
and I, Table 1). Furthermore, just as the mean 
HQ was noticeably higher than the mean 
Present IQ so was the mean ETRS score ac- 
companying the HQ correspondingly higher. 
Thus, the intellectual disability found in the 
study seems to have been a function of attitud- 
inal disturbance. Therefore, the authors would 
maintain that in schizophrenics an “opera- 
tional” deterioration does exist, although it 
seems likely that this loss is due to attitudinal 
factors. 

The fact that the mean HQ was not signifi- 
cantly higher than the mean Present IQ sug- 
gests that even with the greater opportunity 
for adequate stimulation which is found in 
such a person to person relationship, the atti- 
tudinal disturbance of the schizophrenic can- 
not be completely overcome. It still imposes 
an intellectual penalty. 

Although the selected Wechsler subtests did 
resist the extrinsically induced deterioration 
better than the paper and pencil tests, they 
were not an accurate estimate of premorbid 
IQ. The extrapolated Vocabulary Quotient 
was the least inaccurate index of any of the 
single subtests or combination of subtests used 
in this study, but it was still highly inadequate. 

It may be concluded from this study that: 

1. There was a very significant and wide 
range of IQ loss in schizophrenics from the 
prepsychotic to psychotic level. 

2. The loss in intellectual efficiency seems 
closely related to factors such as attention, 
concentration, negativism, preoccupation, and 
apathy. 

3. The Wechsler subtests Information, Comprehension, and Vocabulary, singly or in any combination, are not reliable indices to premorbid IQ.

This study remains in progress. Because of 
the difficulty in obtaining suitable patients, 
comprehensive research could only be made 
with the cooperation of other institutions and 
public schools. However, it seems likely that 
additional data will only enhance and confirm 
the trends already indicated. The present study is offered in hope that further studies may be inaugurated by others to clarify this important psychological issue.


THE STANDARDIZATION OF THE WECHSLER
INTELLIGENCE SCALE FOR CHILDREN¹


HAROLD SEASHORE, ALEXANDER WESMAN AND JEROME DOPPELT


THE PSYCHOLOGICAL CORPORATION 


THE Wechsler Intelligence Scale for Children has grown logically out of the Wechsler-Bellevue Intelligence Scales used
with adolescents and adults [4]. In fact, most 
of the items in the WISC are from Form II 
of the earlier scales, the main additions being 
new items at the easier end of each test to per- 
mit examination of children as young as five 
years of age. 

Even though the materials overlap, the 
WISC is a distinct test from the Wechsler- 
Bellevue Scales and is independently standard- 
ized. The scales overlap in usefulness since 
both scales can be used with adolescents. However, it is expected that the WISC will be preferred in testing adolescents up through the age
of fifteen years. 

This new Children’s Scale (as it probably will come to be known in every-day clinical parlance) has been standardized with exceptional care over a five-year period of experimental tryouts, field testing, and statistical analysis. In this paper some of the principal research data are reported.

The WISC consists of twelve tests which, 
as in the Adult Scale, are divided into two 
subgroups identified as Verbal and Perform- 
ance. In the standardization, there were six 
tests in each of the subgroups: 

Verbal                      Performance
General Information         Picture Completion
General Comprehension       Picture Arrangement
Arithmetic                  Block Design
Similarities                Object Assembly
Vocabulary                  Coding
(Digit Span)                (Mazes)


In the interest of shortening the time required for examination, the Scale is to be administered ordinarily on the basis of only ten tests. For various statistical and practical reasons, Digit Span is considered an alternate test in the Verbal series and Mazes an alternate in the Performance series. As a matter of fact, the reasons for including Coding or Mazes are about equally good except that the Mazes take a considerably longer time than Coding and will probably, therefore, not be preferred. The conditions for using the alternate tests are described in the manual.

¹This report of the standardization is an expansion of technical sections in the test manual: David Wechsler, Wechsler intelligence scale for children. New York: Psychological Corporation, 1949. Pp. 113.

The reader is referred to the manual and 
to Wechsler’s earlier text for a more complete 
description and discussion of the tests. 


THE STANDARDIZATION SAMPLE 


Age of Children and Size of Sample. The 
WISC was standardized on a sample of 100 
boys and 100 girls at each age from five 
through fifteen years. Each child was tested 
within one and one-half months of his mid-year; e.g., the five-year-olds were past 5 years,
4 months and 15 days but were not yet 5 years, 
7 months and 15 days. The feebleminded cases 
were exceptions as an adequate sample could 
not be secured without permitting more varia- 
tion; nearly all, however, were within two 
months of their mid-year. There were 1100 
boys and 1100 girls in eleven age groups, a 
total of 2200 cases. Actually more cases were 
tested, but the final sample includes those who 
best satisfied the other sampling requirements 
described below. Only white children were ex- 
amined. 

Variables of Sampling. It was determined at 
the beginning that for each age (as closely as 
practicable) and for the total sample, the se- 
lected cases should meet certain sampling re- 
quirements based on U.S. Census Bureau data 
for 1940, with some adjustment for the recent 
shift of population toward the West.

1. Areas. The states were divided into four geographic areas as defined in Table 1.

2. Urban-Rural. The urban-rural proportions of the total U. S. 1940 population were used as shown in Table 2.

3. Parental Occupation. The children’s fathers were to be occupationally distributed similarly to all employed white males. The fourteen U. S. Census categories were reduced by combinations into nine, as shown in the footnote of Table 4. The quota for each geographic area was further defined in terms of Census reports on employment within each area.


Drawing the Sample. With these controls, 
tables of requirements were set up for examin- 
ers in each area. It was not expected that the 
occupational and urban-rural requirements 
would be exactly satisfied in each area, but that 
the over-all conditions would be met by the 
national sample. 


Specific directions and worksheets for draw- 
ing a sample in a given school were provided 
so that cases would not be “volunteered” or 
“thrown in” at the whim of the examiner or 
school official. 


Most of the 55 feebleminded cases were ex- 
amined at the Illinois State School, Lincoln, 
Illinois; at Letchworth Village, New York; 
and at the Wayne County Training School, 
Michigan; a few selected cases from “special 
classes” of two public schools were included. 
The staff psychologists in the institutions aided 
in the selection of cases of the required ages 
who were rated as having IQ’s under 70 and 
not below 50. Cases where postnatal disease
or accident was considered causative of the
deficiency were omitted. The number of feeble-
minded cases appearing in the regular school 
sampling was not determined. No examiner as- 
signed to public schools reported that any case 
was officially labelled as feebleminded. In all, 
then, 2.5 per cent of the total number of cases 
in the standardization population is known to 
be feebleminded. 

Analysis of the Obtained Sample. Tables 1, 
2, 3, 4 and 5 present the sampling data on the
2200 cases finally included in the standardiza- 
tion group. 

TABLE 1
SAMPLE BY GEOGRAPHIC AREA

                                      Per Cent
                                      in U.S.        Wechsler Sample
Area                                  Population       %         N
I    New England and
     Middle Atlantic States             29.2          31.0      683
II   North Central States               32.7          28.9      635
III  South Atlantic and
     South Central States               26.8          26.5      583
IV   Mountain and Pacific States        11.3          13.6      299
     Total                             100.0         100.0     2200


TABLE 2
SAMPLE BY URBAN-RURAL RESIDENCE

                      Per Cent
                      in U.S.        Wechsler Sample
                      Population       %         N
Urban                   57.9          60.3     1327
Rural                   42.1          37.2      818
Institutional*           —             2.5       55
Total                  100.0         100.0     2200

*The 55 feebleminded cases from institutions were not reported as either rural or urban.


TABLE 3
PROPORTION OF URBAN AND RURAL CASES EXPECTED AND OBTAINED IN EACH AREA

                     URBAN                             RURAL
         Expected   Wechsler Sample        Expected   Wechsler Sample
AREA        %          %        N             %          %        N
I          38.4       34.5     458           16.2       23.4     191
II         32.9       32.9     437           32.5       21.6     177
III        17.2       17.5     232           40.0       42.9     351
IV         11.5       15.1     200           11.3       12.1      99
Total     100.0      100.0    1327          100.0      100.0     818


In Table 1 it is seen that the midwest sample (Area II) is slightly short of cases; this was deliberate in order to increase the western proportion in accord with wartime and postwar population shifts. All in all, the area sampling is eminently satisfactory.
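
The area comparison in Table 1 amounts to converting the obtained counts to percentages and setting them beside the Census figures. A minimal sketch of that arithmetic follows; the dictionary names are illustrative, and the numbers are those printed in Table 1.

```python
# Census percentages and obtained sample counts by area, from Table 1.
census_pct = {"I": 29.2, "II": 32.7, "III": 26.8, "IV": 11.3}
sample_n = {"I": 683, "II": 635, "III": 583, "IV": 299}

total = sum(sample_n.values())

# Obtained percentage of the whole sample drawn from each area.
sample_pct = {area: round(100 * n / total, 1) for area, n in sample_n.items()}

# Obtained minus expected, in percentage points (negative = shortage).
deviation = {area: round(sample_pct[area] - census_pct[area], 1)
             for area in sample_n}

for area in sample_n:
    print(f"Area {area}: expected {census_pct[area]:.1f}, "
          f"obtained {sample_pct[area]:.1f}, deviation {deviation[area]:+.1f}")
```

The deviations show the pattern the text describes: a shortage of a little under four points in Area II and an excess of about two points in Area IV.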


Table 2 reveals that in the national standardization an appropriate number of rural children was included. Table 3 presents the urban-rural data in a different analysis. This table shows what percentage of the total sample of urban cases was expected from each of the areas and what percentages were obtained. The rural sources are analyzed similarly. The only large differences are the relatively high rural contributions by Area I and a shortage in Area II.

The values in Table 3 are slightly distorted because the 55 feebleminded cases (2.5 per cent of all cases) were not allocated by residence.

The reader should also be reminded that the rural category includes all cases living on farms and in communities of less than 2500 population. “Bedroom villages” attached to large cities were generally avoided.

The statistical significance of the discrepancies between expected cases and obtained cases could be computed, but this seems inappropriate because the expected percentages themselves are quite imperfect criteria. Allocations based on 1940 Census data are the best criteria the authors could set for control of the sampling. But we cannot readily know whether a discrepancy of, say, five percentage points, is due to obsolete Census data or to faulty sampling. For that reason one must be satisfied with reasonable approximations. There is reason to believe that there has been some trend toward urbanization since 1940, and that the five per cent shortage of rural cases is, therefore, an exaggeration as of 1947-1948.


TABLE 4
OCCUPATION OF FATHERS OF CHILDREN IN STANDARDIZATION SAMPLE

                  Employed        Wechsler Sample        Wechsler Sample
Occupational      U. S. Males   All Cases   All Cases    Boys     Girls
Groups*               %             %           N          %        %
1                     5.9           8.0        176         7.9      8.1
2                    14.0          10.0        222        10.3      9.9
3                    10.6          11.6        256        11.9     11.4
4                    13.9          12.7        280        12.5     12.9
5                    15.6          17.9        393        18.8     16.9
6                    18.8          16.5        363        16.6     16.4
7                     6.0           5.5        122         5.5      5.5
8                    14.5          13.8        303        12.4     15.2
9                      .7           1.4         30         1.5      1.2
(Feebleminded)         —            2.5         55         2.5      2.5
N                                              2200        1100     1100

*A consolidation of 14 Census groups, 1940:
1. (I and II) Professional and semiprofessional workers
2. (III) Farmers and farm managers
3. (IV) Proprietors, managers and officials
4. (V) Clerical, sales and kindred workers
5. (VI) Craftsmen, foremen and kindred workers
6. (VII) Operatives and kindred workers
7. (VIII, IX and X) Domestic, protective and other service workers
8. (XI, XII and XIII) Farm laborers and foremen, and laborers
9. (XIV) Occupation not reported


TABLE 5
OCCUPATION OF FATHERS OF CHILDREN IN STANDARDIZATION SAMPLE

                      AREA I              AREA II             AREA III            AREA IV
Occupational     Expected Obtained*  Expected Obtained   Expected Obtained   Expected Obtained
Groups               %        %          %        %          %        %          %        %
1                   7.0      8.6        5.3      6.0        4.7     10.6        6.8      7.0
2                   3.6      3.7       17.2     12.9       22.9     16.5       10.0      7.7
3                  11.1     11.2       10.0     10.3        9.8     13.2       12.0     14.4
4                  16.6     16.9       13.1      9.3       11.7     10.3       14.0     17.7
5                  18.1     18.2       15.4     22.8       12.7     16.1       16.0     13.7
6                  23.3     19.6       18.1     18.7       15.8     13.4       16.4     14.4
7                   7.7      7.9        5.3      4.1        5.1      3.4        8.8      8.7
8                  11.9     12.0       14.4     14.5       16.6     14.9       15.6     16.4
9                    .8      1.8         .7      1.5         .7      1.5         .4       —
N                   640      649        720      614        590      583        250      299
(Feebleminded)                34                  21                  —                   —

*Obtained percentages computed on N without feebleminded cases.



Tables 4 and 5 describe the sample by occu- 
pation of the father. Again, reasonable agree- 
ment between the Census expectancy and the 
actual sample is evident. The greatest shortage 
is in Occupational Group 2, farmers and farm 
managers. If one mentally combines the per- 
centages of Occupations 3 and 4, as seems logi- 
cal from the descriptions, the expected percent- 
age would be 24.5 and the obtained percentage 
24.3. Similarly, if one combines Occupational 
Groups 5 and 6, the expected percentage is 
34.4, which is identical to the obtained per- 
centage. A slight error in the obtained percent- 
ages occurs because the 2.5 per cent of feeble- 
minded cases were not allocated by parental 
occupation. In the last columns of Table 4, the analysis shows that the percentages of boys and girls with respect to parental occupation are very similar.

Table 5 presents the occupational sampling 
by area. The agreements are not as good here 
as for the national sample, but there are no 
gross miscarriages of sampling. 

The field examiners used utmost care in ascertaining the father’s occupation and were asked to write descriptions in considerable detail. The final classifications were made with the detailed Census descriptions at hand. Note that the field examiners were able to secure rather good samples in category 8, farm laborers and other laborers; this is usually a difficult task. Similarly, the excess in Occupation 1 (which is usually a category that one finds difficult to keep small enough) is considerable only for Area III.


The best available base for the selection of this sample was the occupational distribution as reported in the United States Census. However, one should not take percentages derived from those data as being absolute criteria against which to select cases. There obviously have been shifts in occupational percentages because of the war, and also shifts in occupational groups from area to area. Certain parts of the country have become more industrialized. There has also been a considerable shift in the west coast total population.


TABLE 6
CORRELATIONS OF EACH TEST WITH THE VERBAL, PERFORMANCE AND FULL SCALE SCORES
100 BOYS AND 100 GIRLS AT EACH AGE

                           VERBAL*            PERFORMANCE†          FULL SCALE‡
Age                     7½    10½   13½      7½    10½   13½      7½    10½   13½
Verbal Score             —     —     —      .60   .68   .56       —     —     —
Information             .64   .82   .80     .44   .59   .51      .59   .77   .73
Comprehension            ?    .70   .68     .46   .56   .37      .54   .69   .58
Arithmetic              .55   .70   .59     .46   .57   .38      .57   .69   .55
Similarities            .55   .72   .74     .41   .48   .52      .53   .65   .71
Vocabulary              .66    ?    .75     .47   .68   .51       ?     ?    .70
Digit Span              .48   .50   .44     .45   .40   .29      .52   .50   .42
Performance Score       .60   .68   .56      —     —     —        —     —     —
Picture Completion      .42   .45   .38     .34   .48   .55      .43   .51   .51
Picture Arrangement     .51   .58   .43     .51   .53   .51      .58   .62   .53
Block Design            .42   .55   .50     .53   .66   .65      .52   .64   .64
Object Assembly         .38   .38   .31     .59   .52   .68      .52   .47   .52
Coding||                 ?    .42   .42     .32   .35   .42      .35   .43   .48
Mazes                    ?    .43   .40     .51   .55   .39      .46   .53    ?

*Sum of 5 tests, Digit Span omitted.
†Sum of 5 tests, Mazes omitted.
‡Sum of 10 tests, Digit Span and Mazes omitted.
||Coding A at age 7½; Coding B at ages 10½, 13½.


The sampling requirements were defined in 
terms of the occupations of the fathers, and of 
urban-rural residence of workers, but this does 
not mean that children across the country are 
distributed in the same percentages. It is gener- 
ally known that rural families, laboring famil- 
ies, and perhaps southern families have more 
children than urban, upper middle class, and 
northern families. The data for making ad- 
justment in quotas are complicated and incom- 
plete. Modification of the percentages in each 
occupational category, and in the urban and 
rural categories to account for differential 
fecundity did not seem feasible. 


TABLE 7
MEDIAN CORRELATION COEFFICIENTS BETWEEN TESTS
AND VERBAL, PERFORMANCE AND FULL SCALE SCORES

                                      Perform-   Full
                            Verbal    ance       Scale
                            Score     Score      Score
Median r of
  Verbal tests               .67       .46        .61
Median r of
  Performance tests          .42       .51        .52








INTERCORRELATIONS OF THE TESTS WITH VER- 
BAL, PERFORMANCE AND FULL SCALE SCORES 


Table 6 is constructed to show the relation- 
ship of each test with the three special scores. 
When a test is correlated with the composite of 
which it is also a contributing member (e.g., 
Vocabulary with the Verbal Score) a correc- 
tion [2] for spuriousness has been applied. The 
data are presented for three representative 
ages. In the manual more complete tables are 
printed showing the intercorrelations of the 
tests themselves. Table 7 may be helpful in 
comprehending the mass of coefficients in 
Table 6. 
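
The text does not reproduce the correction cited as [2]. One common way of removing this kind of spuriousness is to report the correlation of a test with the composite of the remaining tests, which can be obtained algebraically from the uncorrected part-whole correlation and the two standard deviations. The sketch below illustrates that standard part-whole correction on made-up scores; it is offered as an assumption about the general technique, not as the authors' exact procedure, and all names and data in it are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Product-moment correlation (population form)."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))


def part_whole_corrected(r_it, s_i, s_t):
    """Correlation of test i with the total minus test i, given the
    uncorrected correlation r_it and the standard deviations of the
    test (s_i) and of the total that includes it (s_t)."""
    return (r_it * s_t - s_i) / sqrt(s_t ** 2 + s_i ** 2 - 2 * r_it * s_t * s_i)


# Hypothetical scaled scores for eight children (not from the study).
vocabulary = [8, 12, 10, 7, 13, 9, 11, 10]
other_verbal = [30, 44, 41, 29, 50, 33, 45, 38]   # sum of the other tests
verbal_total = [v + o for v, o in zip(vocabulary, other_verbal)]

r_uncorrected = pearson(vocabulary, verbal_total)
r_corrected = part_whole_corrected(
    r_uncorrected, pstdev(vocabulary), pstdev(verbal_total))
r_direct = pearson(vocabulary, other_verbal)

print(round(r_uncorrected, 3), round(r_corrected, 3), round(r_direct, 3))
```

The corrected value agrees with the correlation computed directly against the remaining tests and is lower than the uncorrected one, which is the inflation the correction is meant to remove.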

The Verbal tests correlate more highly with 
the Verbal Score than with the Performance 
Score, and likewise the Performance tests cor- 
relate more highly with the Performance Score 
than with the Verbal Score. This is as one 
would expect. It does appear that the Verbal tests are somewhat more homogeneous as to abilities measured.

The correlations between the Verbal Score 
and the Performance Score are sufficiently 
high (.60, .68, .56 for the three ages) to indi- 
cate considerable common variance, yet are 
low enough to suggest that the abilities includ- 
ed in V and P cannot be readily inferred from 
each other. Both classes of abilities need to be 
tapped in an over-all appraisal of abilities. 

These data also indicate that the Digit Span 
is the least like the other Verbal tests; for this 
reason it was made an alternate. The Coding 
and Maze tests are about equally eligible to re- 
main in the Performance Scale, with the pre- 
ference going to Coding on the basis of ease of 
scoring and brevity. 


RELIABILITY 


The reliability coefficients of the individual tests and of the Verbal, Performance and Full Scale Scores are presented in Table 8 for ages 7½, 10½ and 13½. These three ages were selected for presentation as being probably most representative of the age range for which the Wechsler Intelligence Scale for Children is designed. Reliability coefficients have been computed by the split-half technique, with appropriate correction for full length of the test by the Spearman-Brown formula.


TABLE 8
RELIABILITY AND STANDARD ERROR OF MEASUREMENT* OF THE WISC TESTS
N = 200 FOR EACH AGE LEVEL

                                 Age 7½          Age 10½         Age 13½
                                r     SEm       r     SEm       r     SEm
Information                    .66   1.75      .80   1.34      .82   1.27
Comprehension                  .59   1.92      .73   1.56      .71   1.62
Arithmetic                     .63   1.82      .84   1.20      .77   1.44
Similarities                   .66   1.75      .81   1.31      .79   1.37
Vocabulary                     .77   1.44      .91    .90      .90    .96
Digit Span                     .60   2.45      .59   1.92      .50   2.12
Verbal Score                   .88   5.19      .96   3.00      .96   3.00
  (without Digit Span)

Picture Completion             .59   1.92      .66   1.75      .68   1.70
Picture Arrangement            .72   1.59      .71   1.62      .72   1.59
Block Design                   .84   1.20      .87   1.08      .88   1.04
Object Assembly                .63   1.82      .63   1.82      .71   1.62
Coding†                        .60   1.90       —     —         —     —
Mazes                          .79   1.37      .81   1.31      .75   1.50
Performance Score              .86   5.61      .89   4.98      .90   4.74
  (without Coding and Mazes)

Full Scale Score               .92   4.25      .95   3.36      .94   3.68
  (without Digit Span,
  Coding and Mazes)

*SEm is in Scaled Score units for the tests and in IQ units for the Verbal, Performance and Full Scale Scores.

†Based on correlating Coding A and Coding B, 115 cases. See text for explanation. For age 8½ the value is .56 for 91 cases.


This technique could not legitimately be 
used for estimating the reliability of the Coding 
test, which is essentially a speed test; nor did 
the Digit Span test lend itself to such treatment 
because of its administration as two separate 
subtests—Digits Forward and Digits Backward. The reliability coefficients reported for Coding were made possible because, for ages 7½ and 8½, many of the children were given both Coding A and Coding B. (See administrative manual for description of these tests.)
The reported values thus are based on an alter- 
nate test situation. The coefficients presumably 
would be a little higher if scores on Coding A 
were correlated with scores on a strict alternate 
form. The reliability coefficients shown for the 
Digit Span test are based on the correlation be- 
tween scores on Digits Forward and scores on 
Digits Backward corrected according to the 
Spearman-Brown formula. 
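
The Spearman-Brown step-up used here is simple enough to state in a few lines. The sketch below shows the general formula and, for a doubled test, its inverse; the function names and the example value of .50 are illustrative and not taken from the manual.

```python
def spearman_brown(r_part, factor=2.0):
    """Estimated reliability of a test lengthened `factor` times, given the
    correlation r_part between comparable parts (factor=2 for two halves)."""
    return factor * r_part / (1 + (factor - 1) * r_part)


def half_correlation(r_full):
    """Inverse for the two-half case: the half-test correlation that would
    step up to a full-length reliability of r_full."""
    return r_full / (2 - r_full)


# Illustrative value only: two halves correlating .50 imply a
# full-length reliability of about .67.
print(round(spearman_brown(0.50), 2))
```

Run in reverse, the same relation shows how modest the underlying part correlations are: a corrected coefficient of .60 corresponds to a correlation of only about .43 between the two parts.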

For the composite scores (Verbal, Perform- 
ance and Full Scale Scores) the sum of the 
scores on odd items in the contributing tests 
were correlated with the sum of the even items. 

The reliability coefficients presented in these 
tables should be carefully considered by the 
conscientious clinician when interpreting the 
scores earned on separate tests, or differences 
between scores. The smaller the reliability of a 
given score, the less confidence one can have in 
the judgments made concerning a child's true 
ability based on that particular test. Judgments 
with respect to differences between scores on 
two tests of moderate reliability must be made 
with considerable caution—the lower the re- 
liability of the scores, the more likelihood there 
is that the difference between them is due to 
chance rather than to any real difference in the 
abilities possessed by the child. As may be seen by reference to the reliability table, this caution is more necessary for some tests than for others. It is least necessary when working with the composite Verbal, Performance and Full Scale Scores, which are highly reliable.

As another statement of the stability of 
scores on the Wechsler Intelligence Scale for 
Children, Table 8 presents the standard error 
of measurement by test and age. This measure 
indicates the band of error which surrounds 
the child’s test score. Thus, a SE_m of 1.75 for
7½-year-olds on Information indicates that
the chances are about two out of three that a
true score on this test is within 1.75 points of
the obtained scaled score. One can be highly
certain that the true score is within 5.25 points
of the obtained score (5.25 is three times the
SE_m of 1.75). Note that there are considerable
differences in size of SE_m from test to test. For
example, confidence in the stability of Block
Design for 7½-year-olds is permissible within
limits of ±1.20 (chances two out of three)
and ±3.60 (high certainty). Obviously, the
smaller the SE_m, the less allowance one needs
to make for unreliability of the score. Differences
between Block Design and Vocabulary
scores are less likely to be due to chance than 
are differences between scores on Object As- 
sembly and Comprehension. These facts call 
for special wariness in attempts to compare 
differences between test profiles. 
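
The error bands described above follow from simple arithmetic, sketched below. The text does not state how the SE_m values in Table 8 were computed; the sketch assumes the conventional formula SE_m = SD × √(1 − r), and all function names are hypothetical.

```python
import math

SCALED_SCORE_SD = 3.0  # standard deviation of scaled scores, as stated in the text


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Conventional formula (an assumption here): SE_m = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)


def implied_reliability(sd: float, se_m: float) -> float:
    """Invert the same formula to see what reliability a given SE_m implies."""
    return 1.0 - (se_m / sd) ** 2


def error_bands(obtained: float, se_m: float) -> dict:
    """Bands of one SE_m (chances about two in three) and three SE_m (high certainty)."""
    return {
        "two_in_three": (obtained - se_m, obtained + se_m),
        "high_certainty": (obtained - 3 * se_m, obtained + 3 * se_m),
    }


# SE_m of 1.75 quoted for Information at age 7 1/2; the obtained score of 10 is hypothetical.
print(error_bands(obtained=10, se_m=1.75))
print(round(implied_reliability(SCALED_SCORE_SD, 1.75), 2))
```

A smaller SE_m, such as the 1.20 quoted for Block Design, narrows both bands in proportion.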

The reader of Table 8 should not be confused
by the discrepancy between the size of
the SE_m for the individual tests, as contrasted
with the SE_m for Verbal, Performance and
Full Scale IQ’s. For individual tests, the SE_m
is in scaled score units; for the IQ’s the SE_m is
in IQ units, which are the ones in which most
test users are interested. Thus, the SE_m of
5.19 for the Verbal IQ of 7½-year-olds indicates
that the true IQ is probably (chances
two out of three) within 5 IQ points of the
obtained IQ.


SCALED SCORES 


The scaled scores have been so derived as to 
provide, at each age and for each of the separate 
tests, a mean scaled score of 10 and a standard 
deviation of 3. This was accomplished by pre- 
paring a cumulative frequency distribution of 
raw scores for each test at each age level and 
setting each percentile point at its appropriate
standard score value on a theoretical normal 
curve with a mean of 10 and standard devia- 
tion of 3. Scores for all ages on a single test 
were then listed in parallel columns and minor 
irregularities in the progression of scaled score 
equivalents from age to age were smoothed. 
The assumption that these irregularities were 
chance results of population sampling seemed 
to be the only tenable position. Few instances 
of such minor deviations were found, and the 
Scales are essentially a direct translation from 
raw scores to a normalized distribution of 
scaled scores with a mean of 10 points and 
standard deviation of 3 points. 
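
The normalizing step described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the standardization: the midpoint percentile-rank convention, the rounding to whole scaled scores, the function name, and the sample raw scores are all hypothetical, and the age-to-age smoothing is not shown.

```python
from collections import Counter
from statistics import NormalDist


def scaled_score_table(raw_scores, mean=10.0, sd=3.0):
    """Map each observed raw score to a normalized scaled score.

    Each raw score is given a percentile rank from the cumulative frequency
    distribution (midpoint convention, an assumption), and that rank is placed
    at its standard-score value on a normal curve with the given mean and SD.
    """
    n = len(raw_scores)
    counts = Counter(raw_scores)
    unit_normal = NormalDist()
    table = {}
    below = 0
    for raw in sorted(counts):
        percentile = (below + counts[raw] / 2) / n
        z = unit_normal.inv_cdf(percentile)
        table[raw] = round(mean + sd * z)
        below += counts[raw]
    return table


# Hypothetical raw scores for one test at one age level.
sample = [3, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, 8, 9]
print(scaled_score_table(sample))
```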

The manual presents 33 tables for convert- 
ing raw scores on each test into scaled scores. 
The tables for the mid-years (age of testing) 
were first made. Then tables for each four- 
month span were constructed by interpolation;
thus there are tables for 6-0 through 6-3; 6-4 
through 6-7, 6-8 through 6-11, etc. After se- 
curing raw scores on each test, the examiner 
converts them to scaled scores by using the ap- 
propriate age table for the subject. These 
scaled scores, then, are the basis for determining
the IQ.


THE DEVIATION INTELLIGENCE QUOTIENT 


One of the most important innovations in 
the standardization of the present Scale is that 
IQ’s are obtained by comparing each subject’s 
test performance not with a composite age 
group but exclusively with the scores earned 
by individuals in a single (that is, his or her 
own) age group.² With one stroke, the deviation
IQ method cuts away much of the underbrush
which has encumbered the problem of
the variability of the individual’s IQ. By keep- 
ing the standard deviation of IQ’s identical 
from year to year, a child’s obtained IQ does 
not vary unless his actual test performance as 
compared with his peers varies; if the stand- 
ard deviations were not made identical, a 
child’s obtained IQ might vary considerably 
from year to year, even though his relative ability
remained constant. Apart from test unreliabilities,
IQ’s obtained by successive retests
with the WISC automatically give the subject’s
relative position in the age group to which
he belongs at each time of testing. If any
changes are observed they may be ascribed to
changes in the subject and not in the structure
of the test nor its standardization, since in IQ
units the standard deviations as well as the
means of all age groups are identical. It is no
longer a matter of discovering how many
children test above or below a given IQ in the
population, since the deviation IQ is by definition
dependent on the normal distribution of
the test scores.

² The deviation IQ concept has been similarly employed
in some group tests, notably the Otis Tests
and Pintner General Ability Tests.


Each person tested is assigned an IQ which, 
at his age, represents his relative intelligence 
rating. This IQ, and all others similarly ob- 
tained, are deviation IQ’s since they indicate 
the amount by which a subject deviates above 
or below the average performance of individu- 
als of his own age group. The IQ of 100 on 
the WISC is set equal to the mean total score 
for each age, and the standard deviation is set 
equal to 15 IQ points. In terms of percentile 
limits, the highest one per cent will have IQ's 
of 135 and above, and the lowest one per cent 
IQ’s of 65 and below. The middle fifty per 
cent of children at each age will have IQ’s 
from 90 to 110. 
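
The percentile limits quoted above can be checked against a normal curve with a mean of 100 and a standard deviation of 15. The sketch below uses only the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Deviation IQ scale as defined in the text: mean 100, standard deviation 15.
iq_scale = NormalDist(mu=100, sigma=15)

top_one_per_cent = iq_scale.inv_cdf(0.99)      # lower limit of the highest 1 per cent
bottom_one_per_cent = iq_scale.inv_cdf(0.01)   # upper limit of the lowest 1 per cent
middle_low = iq_scale.inv_cdf(0.25)            # limits of the middle 50 per cent
middle_high = iq_scale.inv_cdf(0.75)

print(round(top_one_per_cent), round(bottom_one_per_cent))
print(round(middle_low), round(middle_high))
```

Rounded to whole IQ points, these limits agree with the 135, 65, 90, and 110 given in the text.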

The IQ tables were constructed as follows: 
For each age the five Verbal scaled scores for 
each subject were summed and a mean and 
standard deviation of such sums computed. 
These sums were transformed into a distribution
of IQ’s with a mean of 100 and a standard
deviation of 15. The same process was followed
for translating the sums of the five Performance
scaled scores to an IQ scale with a
mean of 100 and an SD of 15. The IQ’s
based on a Full Scale Score of ten tests were
similarly determined.
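
The transformation described above amounts to a linear rescaling of the sum of scaled scores. The sketch below is illustrative only: the function name is hypothetical, and while a mean of 50 for the sum of five tests follows from each test having a mean of 10, the standard deviation of 10 used for the sum is an invented value, since the actual figure depends on the intercorrelations of the tests.

```python
def deviation_iq(sum_of_scaled_scores: float, norm_mean: float, norm_sd: float) -> float:
    """Rescale a sum of scaled scores to an IQ scale with mean 100 and SD 15."""
    z = (sum_of_scaled_scores - norm_mean) / norm_sd
    return 100.0 + 15.0 * z


# Hypothetical norms for the sum of five Verbal scaled scores.
VERBAL_SUM_MEAN = 50.0   # five tests, each with a mean of 10
VERBAL_SUM_SD = 10.0     # assumed for illustration only

for total in (40, 50, 60):
    print(total, deviation_iq(total, VERBAL_SUM_MEAN, VERBAL_SUM_SD))
```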

One set of three IQ tables (V, P, and FS)
suffices for all ages since the process of scaling 
each test for each age resulted in similar means 
and sigmas of the sums of five or ten tests at 
all ages. This was not a fortuitous result, but 
is one that should be expected because each of 
the tests was standardized for each age so that 
the raw scores are converted into scaled scores 
with a mean of 10 and a standard deviation of 3.


INTELLIGENCE QUOTIENTS AND AGE


With all the standardization processes described
above carried out, a test of the statistical
transformations from the original responses
of the child to the final IQ’s is whether the
mean IQ’s and their standard deviations for
all ages approximate 100 and 15, respectively,
when each subject’s scores are converted
to IQ’s. Table 9 shows the data for boys and
girls and for the three Scales.


TABLE 9
THE MEAN AND SD OF IQ’S ON THE THREE SCALES
BY AGE AND SEX
100 BOYS AND 100 GIRLS AT EACH AGE

                  VERBAL        PERFORMANCE      FULL SCALE
Age*            Mean    SD      Mean    SD      Mean    SD
Boys
   5            99.5  13.8      98.6  16.9      99.0  15.4
   6           100.6  14.8      98.9  16.0      99.7  15.7
   7            99.6  14.2     100.0  15.2      99.8  14.6
   8           101.7  15.5     102.1  15.4     102.1  15.5
   9            99.2  16.2      99.2  16.9      99.1  17.1
  10           102.1  15.2     101.7  15.3     102.1  15.5
  11           102.0  16.4      99.8  15.4     101.1  16.1
  12           102.4  16.7     101.6  15.3     102.1  16.5
  13           101.9  14.7     100.6  15.2     101.4  14.1
  14           101.6  16.6     100.7  15.1     101.3  15.9
  15           102.4  13.5     100.4  14.5     101.6  13.7
  All          101.2  15.3     100.3  15.6     100.8  15.6
Girls
   5            99.8  13.4     101.0  13.9     100.4  13.2
   6           100.6  13.9     101.4  13.4     101.0  13.6
   7           100.6  11.7     100.9  12.0     100.7  11.4
   8            97.8  15.8      97.9  15.3      97.6  15.8
   9           100.4  13.0     100.4  14.0     100.4  13.1
  10            98.4  16.7      98.4  13.8      98.2  16.3
  11            97.9  14.5      99.6  13.8      98.6  13.8
  12            97.5  14.7     100.0  14.4      98.5  14.5
  13            98.3  16.2      98.7  15.8      98.3  15.9
  14            97.6  14.3     100.2  15.3      98.7  14.4
  15            97.3  16.4      97.9  15.4      97.4  16.5
  All           98.7  14.7      99.7  14.4      99.1  14.4
Boys and Girls
  (N = 2200)   100.0  15.1     100.0  15.0     100.0  15.0

*Read these ages as 5½, 6½, etc.


The bottom line of values shows that for all 
cases—2200 in all—the requirement is exactly 
met. However, for boys and girls separately 
and for different ages, small discrepancies oc- 
cur. The age-to-age discrepancies are caused in 
part by the fact that to secure one IQ table 
for all ages small discrepancies in means and 
standard deviations of scaled scores for the 
eleven ages were eliminated by averaging. It 
was assumed that these discrepancies were due
to sampling and that smoothing of the data in 
making scaled scores and the IQ tables was 
justified. 


However, the sex differences are not so easy 
to explain. Scaled score and IQ tables were 
made by treating boys and girls as members of 
one sample. The mean IQ’s, however, show
that boys in the standardization sample generally
were slightly superior to girls. The superiority
is primarily in the older ages, and
the differences are small. On the Verbal Scale,
the boys excel the girls by more than three
points at ages 8 and 10 through 15. On the
Performance Scale, the difference favors the
boys by more than three points at ages 8 and
10, and the girls are ahead at ages 5, 6, 7, and
9. On the Full Scale, boys have higher IQ’s
than girls by 2.5 to 4.5 points at seven ages, while
girls are ahead by smaller amounts at four
ages.

How shall one interpret these sex differences?
Three explanations come to mind:

A. The tests are fair to both boys and
girls, and boys actually do excel girls, especially
at the later ages.

B. Boys and girls are the same in mental
ability, but the chosen test items turned out to
be slightly biased in favor of the boys.

C. Again, assuming that general ability
is not sex differentiated, the sample of boys
was somehow chosen with a slight bias.

The data at hand do not permit a choice
among these three explanations. The safest assumption is
that factors described in (B) and (C) are involved.
Terman and Merrill [1, 3] found the
same situation in their 1937 Revision of the
Stanford-Binet examination, and likewise could
find no definitive answer from their data.

All in all, the preliminary studies leading to
inclusion of test items and the sampling itself
were fortunate enough to result in mean IQ’s
of boys and girls which are essentially equal.
For all practical purposes the clinical examiner
can ignore sex differences. A difference in mean
scores of three points, for example, is really a
plus and minus difference of 1½ points from
the actual norms based on both sexes.


INTELLIGENCE QUOTIENTS OF RURAL AND
URBAN CHILDREN


Table 10 distributes the IQ’s of urban and
rural children for all ages, the 55 known
feebleminded cases being excluded. As has been
found in many researches, urban children score
higher on mental tests. The Full Scale difference
in the standardization sample is 5.6 points.
The Verbal Scale difference is 6 points, and
the Performance Scale difference is 4 points.
Terman and Merrill report an urban-rural
differential of 6.5 points for the 1937 Revision
of the Stanford-Binet.


TABLE 10
DISTRIBUTION OF IQ’S OF RURAL AND
URBAN CHILDREN

               VERBAL        PERFORMANCE      FULL SCALE
IQ          Rural  Urban    Rural  Urban    Rural  Urban
150-154        -      1        -      -        -      -
145-149        -      3        -      1        -      2
140-144        -      7        -      2        -      2
135-139        2      8        3      5        3      7
130-134        9     16        2     12        2     16
125-129       13     40       14     45       10     41
120-124       22     78       29     82       27     69
115-119       37    115       52    101       46    117
110-114       67    143       92    180       58    167
105-109       85    189       81    159      102    195
100-104      110    211      137    193      122    194
 95- 99      130    170       93    160      119    194
 90- 94      116    150      103    172      119    134
 85- 89       94     96       89    108       68     87
 80- 84       69     67       35     38       66     55
 75- 79       31     22       50     42       39     28
 70- 74       15      7       23     14       25     12
 65- 69       11      1        6     10        6      5
 60- 64        6      2        8      3        4      1
 55- 59        1      1        1      -        2      1
N            818   1327      818   1327      818   1327
Mean        97.3  103.3     98.5  102.5     97.6  103.2
SD          13.5   13.4     13.8   13.5     13.5   12.9


TABLE 11
DISTRIBUTION OF VERBAL IQ’S FOR EACH OCCUPATIONAL GROUP,
FOR THE FEEBLEMINDED GROUP, AND FOR ALL CASES

Verbal        Father’s Occupation*
IQ 1 2 3 4 5 6 7 8 9 FM All
150-154 1 1
145-149 1 1 1 3
140-144 4 1 1 1 7
135-139 3 4 2 1 10
130-134 6 3 4 6 6 25
125-129 19 4 4 7 9 7 1 2 53
120-124 19 9 23 14 9 15 5 3 3 100
115-119 20 11 24 26 30 20 6 12 3 152
110-114 20 20 41 42 39 27 5 15 1 210
105-109 20 19 39 44 56 44 17 32 3 274
100-104 29 21 40 48 59 57 19 46 2 321
95- 99 17 26 25 45 57 62 16 44 8 300
90- 94 8 35 21 24 55 60 23 39 1 266
85- 89 6 29 17 11 39 31 11 42 4 190
80- 84 2 24 9 6 21 23 12 37 2 136
75- 79 1 12 3 2 7 9 2 16 1 2 55
70- 74 2 2 7 + 3 7 8 30
65- 69 4 1 2 4 1 9 21
60- 64 1 2 2 1 2 8 16
55- 59 1 1 14 16
50- 54 9 9
45- 49 4 4
44 and below 1 1
N 176 222 256 280 393 363 122 303 30 55 2200
Mean 110.9 96.8 105.9 105.2 100.8 98.8 97.6 94.6 100.6 59.6 100.0
SD 14.0 14.6 12.9 11.7 12.6 12.2 1238 12.8 16.1 8.1 15.1

*See Code in Table 4.
INTELLIGENCE QUOTIENTS AND FATHER’S
OCCUPATION


Distributions of Verbal, Performance, and
Full Scale IQ’s are given in Tables 11, 12, and
13 for each of the nine occupational categories,
and for the known feebleminded cases. Particular
interest attaches to the differences in mean
scores of these various groups. Table 14 gives
approximate mean IQ’s for the Full Scale.


These differences in mean IQ’s for occupational
groups are considerable but not as great
as those reported by Terman and Merrill for
the 1937 Revision. The categories are not exactly
comparable. Group A corresponds fairly
closely to Terman and Merrill’s groups I and
II, for which they give eight mean IQ’s for
different age groupings, the median of these being
about 115 as compared with 110 from the
Wechsler data. For several age groupings in
their category IV, rural owners, the median of
the mean IQ’s is about 94 as compared with
97 for Group E. Their group VII and the
Wechsler group F are comparable, and yield
mean IQ’s of about 97 (median for mean IQ’s
of four ages) and 94, respectively.


For the lower socioeconomic groups, the two
tests yield similar mean IQ’s of the order of 95.
The somewhat higher mean IQ of Terman
and Merrill’s higher socioeconomic groups, 115
vs. 110, may be accounted for, in part, by the
greater verbal loading in the Stanford-Binet
tests. For WISC, it is noted in Tables 11 and
12 that socioeconomic differences are greater
on the Verbal Scale than on the Performance
Scale.


Whatever other social implications one may
consider, these mean differences between
groups should not be allowed to overshadow
the fact of great overlap in the distributions of
IQ’s for the various occupational groups.


TABLE 12
DISTRIBUTION OF PERFORMANCE IQ’S FOR EACH OCCUPATIONAL GROUP,
FOR THE FEEBLEMINDED GROUP, AND FOR ALL CASES

Performance   Father’s Occupation
IQ 1 2 3 4 5 6 7 8 9 FM All
150-154
145-149 1 1
140-144 1 1 2
135-139 1 4 1 1 1 8
130-134 3 1 3 3 3 1 14
125-129 16 2 8 5 10 12 4 2 59
120-124 13 11 17 19 17 21 3 9 1 111
115-119 22 12 26 26 27 20 7 11 2 153
110-114 31 26 40 48 60 30 7 24 6 272
105-109 21 21 43 38 45 36 12 22 2 240
100-104 20 41 32 45 69 57 17 43 6 330
95- 99 18 22 30 34 44 47 15 42 1 1 254
90- 94 15 24 24 29 48 57 27 47 4 275
85- 89 8 23 15 18 36 37 10 47 3 2 199
80- 84 1 11 6 8 14 16 5 11 1 1 74
75- 79 4 17 6 10 17 8 28 2 3 95
70- 74 1 8 1 3 4 8 3 8 1 5 42
65- 69 1 1 1 1 2 2 4 3 1 9 25
60- 64 1 1 3 1 5 11 22
55- 59 1 7 8
50- 54 7 7
45- 49 6 6
44 and below 3 3
N 176 222 256 280 393 363 122 303 30 55 2200
Mean 107.8 98.6 105.3 104.3 101.6 99.5 96.9 94.9 98.3 61.6 100.0
SD 13.4 13.9 12.9 12.2 13.0 13.5 13.7 13.3 13.6 12.2 15.0


THE FEEBLEMINDED 


Wechsler classes IQ’s under 70 as evidence
of feeblemindedness. On the basis of a mean of
100 and a standard deviation of 15, 2.2 per
cent of cases should have IQ’s below 70. This
is a statistical definition; it says that arbitrarily
2.2 per cent of the cases are feebleminded. The
clinical and social significance of feeblemindedness
is not defined by these numbers except
that if this percentage is very far away from
the proportion who are actually classified as
feebleminded in clinical practice, the statistical
concept will have to be changed. Roughly, the
test author considers that about 3 per cent of
the population could well be classified as
feebleminded by a test; this seems to be reasonable
in the light of actual practice.
In the standardization it was decided to let
about 2.5 per cent of the sample be made up of
known feebleminded cases. The resultant distribution
of IQ’s of these 55 cases is shown in
the columns headed FM in Tables 11, 12, and
13. Some of these 55 institutionalized and
school-identified feebleminded cases tested
above 70. On the V and P Scales, 10 and 12
cases, respectively, tested over 70; but when
V and P scores were made into Full Scale
scores for these subjects, only 4 showed up as
having IQ’s over 70 on the Full Scale.


Some children in the general sample tested 
below the level set for feeblemindedness; in 
the Full Scale the number is 19, which, added 
to the 51 cases (55—4), equals 70 cases, or 
3.2 per cent of the total sample. 
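
The count just given can be set beside the proportion expected from the normal curve. The sketch below uses the figures stated in the text (19 general-sample cases, 51 of the 55 known feebleminded cases, 2200 cases in all); the variable names are illustrative. The exact normal-curve value comes out slightly above the rounded 2.2 per cent quoted earlier.

```python
from statistics import NormalDist

TOTAL_SAMPLE = 2200
general_sample_below_70 = 19      # from the text
known_fm_below_70 = 55 - 4        # 4 of the 55 tested over 70 on the Full Scale

observed = (general_sample_below_70 + known_fm_below_70) / TOTAL_SAMPLE
expected = NormalDist(mu=100, sigma=15).cdf(70)   # area more than 2 SD below the mean

print(f"observed below 70: {observed:.1%}")
print(f"expected below 70: {expected:.1%}")
```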


In making the IQ scales on a deviation basis
it is possible to assign IQ’s as low as the sum
of the scaled scores obtainable. It was felt that
IQ’s under 45 would not be discriminatively
meaningful, so the IQ tables stop at that point.
Persons who have scaled scores yielding IQ’s
below 45 can be recorded as “44 or below.”


TABLE 13
DISTRIBUTION OF FULL SCALE IQ’S FOR EACH OCCUPATIONAL GROUP,
FOR THE FEEBLEMINDED GROUP, AND FOR ALL CASES

Full Scale    Father’s Occupation
IQ 1 2 3 4 5 6 7 8 9 FM All
150-154
145-149 1 1 2
140-144 1 1 2
135-139 2 2 2 1 1 1 1 10
130-134 7 1 4 2 3 1 18
125-129 15 2 12 9 5 7 1 51
120-124 23 8 16 15 14 11 1 6 2 96
115-119 23 17 26 27 30 21 8 8 3 163
110-114 26 14 43 35 48 28 12 16 3 225
105-109 18 24 40 61 63 46 14 28 3 297
100-104 24 27 39 47 61 62 11 43 2 316
95- 99 18 28 37 35 59 58 22 51 5 313
90- 94 8 34 11 29 39 58 22 49 3 253
85- 89 6 21 9 9 31 32 11 34 2 155
80- 84 2 23 10 6 23 16 12 26 3 1 122
75- 79 1 12 6 2 10 13 1 21 1 2 69
70- 74 1 7 4 7 7 10 1 1 38
65- 69 2 1 7 1 8 19
60- 64 1 1 3 6 11
55- 59 1 2 14 17
50- 54 11 11
45- 49 6 6
44 and below 2 4 6
N 176 222 256 280 393 363 122 303 30 55 2200
Mean 110.3 97.4 106.2 105.2 101.3 99.1 97.0 94.2 99.5 56.6 100.0
SD 13.3 14.0 12.4 11.1 12.4 12.2 12.5 12.8 15.2 9.5 15.0


TABLE 14
APPROXIMATE WECHSLER IQ’S FOR OCCUPATIONAL
CATEGORIES

                                                 Approximate
                                                 Wechsler IQ’s
A. (1) Professional and semi-professional
       workers ...................................... 110
B. (3) Proprietors, managers and officials, and
   (4) Clerical, sales and kindred workers ......... 105+
C. (5) Craftsmen, foremen and kindred workers, and
   (6) Operatives and kindred workers ............... 100
D. (7) Domestic, protective and other service
       workers ....................................... 97
E. (2) Farmers and farm managers ..................... 97
F. (8) Farm laborers and foremen, and laborers ....... 94


THE SUPERIOR 


No attempt was made to isolate a unique 
group of very superior children as was done 
with the feebleminded. It was assumed that 
the superior children were in the school systems 
and would enter the sample in proper sampling 
proportions. On the Full Scale, 1.5 per cent
tested above an IQ of 130, whereas 2.2 per
cent were expected. On the Verbal Scale the
percentage was 2.1 per cent, and for the Performance
Scale, 1.1 per cent. In the published
IQ tables the highest IQ assigned is 156, but
children whose scaled scores are higher than
those required to attain this IQ can be recorded
as being 156 or above. Differentiation above
this point probably is not necessary.


To examiners who have been accustomed to
secure IQ’s on the order of 20 to 25 or 170 to
180, lack of very low and very high IQ’s on
WISC may at first be a little disturbing. They
should be reminded that the range of IQ’s on a
deviation scale can be quite arbitrary. For example,
if the mean were set at 100 and the
standard deviation at 20 (instead of 15), lower
and higher IQ’s would be secured arbitrarily.
The reason for setting the standard deviation
at 15 is that it approximates the empirical
standard deviation of about 16 secured by Terman
and Merrill by an age-scale method. With
standard deviations so similar, WISC will approximate
in meaning (as far as size of the
number is concerned) the IQ’s secured by the
Stanford-Binet Revision.
CHANGES IN PERFORMANCE ON THE ROSENZWEIG
PICTURE-FRUSTRATION STUDY FOLLOWING
EXPERIMENTALLY INDUCED FRUSTRATION


ROBERT L. FRENCH 


NORTHWESTERN UNIVERSITY 


THE Rosenzweig Picture-Frustration
Study [2, 5] is a projective test aimed
at determining the nature of a subject’s reac- 
tions to frustrating situations. The relative 
ease with which it can be administered and 
scored suggests that it may have a wide area of 
usefulness, provided its validity can be estab- 
lished. Available indications of validity are 
suggestive but not overwhelming. Rosenzweig 
and Sarason [3] found moderate correlations 
with measures of hypnotizability and of re- 
pression (memory for failures), and Sinaiko 
[8] obtained some significant correlations 
with measures of job efficiency in a group of de- 
partment store section managers. Several case 
studies have yielded further indications of 
agreement with the data from other tests [4,
6, 7].


Evidence of validity may be derived not only 
from correlations of test scores with other 
measures for a series of individuals, but also 
from theoretically comprehensible changes in 
standing on the test when conditions affecting 
the subject are varied experimentally. If the 
test measures frustration response, then either 
inducing or reducing frustration should in- 
fluence test scores, provided it can be assumed 
that varying the degree of a subject’s frustra- 
tion strengthens the tendency to make one kind 
of response as against another, and does not 
simply affect the intensity of all response ten- 
dencies equally. This qualifying assumption is 
necessary because the P-F test is scored for the 
frequency of responses of various kinds, with 
the total number of responses fixed; thus a 
change in score in any category necessarily in- 
volves a change in some other category. Con- 
sideration of the nature of the test and of the 
verbal behavior of many individuals in frus- 
trating situations suggests that this assumption 
should hold to a considerable extent.

The present group experiment with North- 
western University students was designed to 
ascertain the effects of experimental variations 
in frustration upon scores in each of the P-F 
scoring categories, using false reporting of examination
grades as a means of varying
frustration.¹


PROCEDURE 


The 115 students in a social psychology class 
were first given the P-F test as a group. Three 
weeks later they took an essay examination 
covering course material. On this examination 
about half of the students earned A’s and B’s 
(“good” students) and about half, C’s and D’s 
(“poor” students). Grades on the papers of 
half of the good students were then juggled so 
as to reduce them by two letter grades (A’s to
C’s, and B’s to D’s), while those on half of the 
poor papers were increased by two letter 
grades. Immediately after returning the papers 
to the class, the P-F test was re-administered. 
The final procedure was to have the subjects 
describe in writing their reactions to the grades 
they had received. 


Table 1 shows the essentials of the experi- 
mental design. The four subgroups are num- 
bered arbitrarily to facilitate identification. 
Each group comprised 20 subjects, matched 
approximately for distribution of sex and of 
total Impunitive responses on the initial test. 
Groups 1 and 2 were matched exactly for dis- 
tribution of grade earned; so were Groups 3 
and 4. The reduction in number from the original
total of 115 to the final total of 80 resulted
from: matching on the variables just mentioned;
eliminating cases with more than one
unscoreable response on the initial test; subject
absenteeism; and random discards necessary to
equalize the subgroup n’s.

¹ Thanks are due to Miss Betty J. Pickett for assistance
in scoring the tests, and to Dr. Elizabeth G.
French for her help with the statistical computations.


TABLE 1
OUTLINE OF EXPERIMENTAL DESIGN

                               Grade Earned
Grade            C or D                     A or B
Reported         (“Poor” students)          (“Good” students)

A or B           Group 2                    Group 4
                 n = 20                     n = 20
                 Grade raised               Grade same

C or D           Group 1                    Group 3
                 n = 20                     n = 20
                 Grade same                 Grade lowered

It was assumed originally that the grade 
earned could be taken as a rough measure of the 
grade expected, and that expectations would 
on the average be proportional to aspirations 
for all groups. On this basis, the good students 
were assumed to have high aspirations, and the 
poor students, low aspirations. Since for all 
groups aspirations should probably be higher 
than expectations, some frustration was expect- 
ed in Groups 1 and 4 in which grades were not 
changed; but it was anticipated that by con- 
trast Group 3 would be severely frustrated, 
and Group 2 “gratified.” There should thus 
be two principal comparisons to make, that be- 
tween Groups 1 and 2, and that between 
Groups 3 and 4. In a two-way table, the over- 
all effects of the differences expected should 
appear as interaction. 


Some check on the effectiveness of the pro- 
cedure in inducing or reducing frustration was 
afforded by the subjects’ statements following 
the final test. In Group 1, the comments of 16 
of the 20 subjects evidenced frustration, disap- 
pointment, or a kind of defensive indifference. 
In Group 2, 16 subjects said that they had ex- 
pected a lower grade, and the comments of all 
20 suggested degrees of satisfaction ranging 
from complacency to wild elation. In Group 3, 
all subjects reported having been frustrated, 
disappointed, or angrily incredulous.² In Group
4, the subjects’ comments were about evenly
divided between restrained satisfaction and
moderate frustration (for example, at having
missed an A by one point). On the whole, then,
these reports tended to support the original assumptions,
more strikingly with reference to
the contrast between Groups 1 and 2, less so
but still definitely as regards Groups 3 and 4.

² In the hope of encouraging cooperation on the
part of all subjects, the correct grades for this group
were announced before these reports were secured.
Hence the statements from this group may not be
directly comparable with the others.


RESULTS 

The data were analyzed separately for each 
of the 15 possible scoring categories available 
in the P-F test. These include the 9 possible 
combinations of direction of response (Extrapunitive,
Intropunitive, and Impunitive) and
type of response (Obstacle-Dominant, Ego- 
Defensive, Need-Persistent), the totals for the 
three directions, and the totals for the three 
types. The scores employed in each category 
represented the number of responses rather 
than the corresponding percentage of total re- 
sponses used by Rosenzweig. 

The data for each scoring category were 
subjected to analysis of covariance in order to 
control the effects of group differences on the 
initial test and to increase the precision of the 
final error estimate. Table 2 shows for each
category the initial general mean, the final
mean in each group adjusted for the regression
of final on initial scores, and the F obtained
in testing the significance of the differences
among the adjusted means of the four
groups (test of “Between Groups”). Differences
significant beyond the 5 per cent level
may be noted in three categories: Intropunitive
Ego-Defensive, Intropunitive Need-Persistent,
and Intropunitive Total. The F’s approach
significance in the case of two other
categories: Extrapunitive Total and Need-Persistent
Total.


TABLE 2
INITIAL GENERAL MEAN, ADJUSTED FINAL MEANS,
AND F’S OBTAINED IN TESTS OF VARIATION
BETWEEN GROUPS

                                  Adjusted Final Means
Scoring               General          for Groups
Category               Mean       1      2      3      4       F
Extrapunitive  O-D     1.71     1.48   1.77   1.30   1.48    0.67
(E)            E-D     6.05     7.23   6.85   5.61   6.81    1.29
               N-P     2.14     2.48   2.08   2.53   2.07    1.21
Intropunitive  O-D     1.42     1.51   1.24   1.22   1.30    0.52
(I)            E-D     2.61     1.78   2.68   2.47   2.63    4.37**
               N-P     2.97     3.02   3.15   3.86   3.64    2.78*
Impunitive     O-D     1.46     1.66   1.48   1.42   1.34    0.58
(M)            E-D     3.74     2.96   3.85   3.51   3.62    1.67
               N-P     1.66     1.56   1.42   1.57   1.71    0.82
Direction      E       9.90    11.24  10.26   9.53   9.61    1.91
Totals         I       7.01     6.37   6.97   7.63   7.54    2.73*
               M       6.85     6.19   6.64   6.50   6.70    0.28
Type           O-D     4.59     4.65   4.82   3.99   4.20    0.79
Totals         E-D    12.40    12.01  12.72  11.84  12.43    1.06
               N-P     6.77     7.07   6.60   7.98   7.44    2.82

*Significant at 5% level. F needed for 3 and 75 d.f.
is 2.73.

**Significant at 1% level. F needed for 3 and 75 d.f.
is 4.06.
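
The adjustment of final means for the regression of final on initial scores can be sketched as follows. This is a minimal illustration of the usual covariance adjustment with a pooled within-group slope, assumed here to correspond to the procedure used; the function name and the small data set are hypothetical and are not taken from the experiment.

```python
from statistics import mean


def adjusted_final_means(groups):
    """Adjust each group's final mean for its standing on the initial test.

    `groups` maps a group label to a list of (initial, final) score pairs.
    The slope b is the pooled within-group regression of final on initial;
    each final mean is then moved to the value expected had the group's
    initial mean equalled the general initial mean.
    """
    general_initial_mean = mean(x for pairs in groups.values() for x, _ in pairs)
    sum_xy = 0.0
    sum_xx = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sum_xy += sum((x - mx) * (y - my) for x, y in pairs)
        sum_xx += sum((x - mx) ** 2 for x, _ in pairs)
    b = sum_xy / sum_xx
    return {
        label: mean(y for _, y in pairs)
        - b * (mean(x for x, _ in pairs) - general_initial_mean)
        for label, pairs in groups.items()
    }


# Hypothetical (initial, final) response counts for two small groups.
example = {"Group 1": [(1, 2), (3, 4)], "Group 2": [(2, 2), (4, 4)]}
print(adjusted_final_means(example))
```

In this toy example both groups have the same unadjusted final mean, but the group that started lower receives the higher adjusted mean.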


TABLE 3
F’S OBTAINED IN FURTHER ANALYSIS OF VARIATION
BETWEEN GROUPS IN CATEGORIES SHOWING
SIGNIFICANT RESULTS

                                Source of Variation
Scoring                      Grade      Grade
Categories                   Earned     Reported    Interaction
Intropunitive
  Ego-Defensive              2.48       7.05**      5.24*
Intropunitive
  Need-Persistent            7.73**     0.03        0.52
Extrapunitive
  Total                      4.24*      0.63        0.85
Intropunitive
  Total                      6.67*      0.52        0.96
Need-Persistent
  Total                      5.17*      1.26        0.02

*Significant at 5% level. F needed for 1 and 75 d.f.
is 3.97.

**Significant at 1% level. F needed for 1 and 75 d.f.
is 6.99.


The results of further analysis of the varia- 
tion between groups in the case of each of 
these five categories appear in Tables 3 and 4. 
In Table 3 it may be seen that the only signifi- 
cant interaction between “Grade Earned” and 
“Grade Reported” occurs in the case of Intro- 
punitive Ego-Defensive responses, which in- 
volve self-blame, expressions of regret or the 
offering of excuses. Reference to the adjusted 
means in Table 2 and to the #’s in Table 4 
makes it clear that the central fact here is the 
decrease in such responses in Group 1, the poor 
students given their correct grades. As Table 
4 indicates, this is the only significant differ- 
ence between Groups 1 and 2 or Groups 3 and 


113 


4, and hence the only positive finding of the 
sort originally anticipated as a consequence of 
the experimental manipulations. 


TABLE 4
CRITICAL RATIOS (t) OBTAINED IN COMPARISON OF
EACH PAIR OF GROUPS IN CATEGORIES SHOWING
SIGNIFICANT RESULTS

Scoring                                        Groups Compared
Category                         1-2       1-3       1-4       2-3       2-4       3-4
Intropunitive Ego-Defensive      3.19**    -2.45*    -3.02**   0.74      0.18      -0.57
Intropunitive Need-Persistent    0.39      2.51*     -1.85     2.12*     -1.47     0.66
Extrapunitive Total              1.22      2.12*     2.02*     0.91      0.81      -0.10
Intropunitive Total              1.21      2.53*     -2.35*    -1.33     1.14      0.18
Need-Persistent Total            0.89      1.63      0.70      2.53*     -1.60     0.98

*Significant at 5% level. t required for 75 d.f. is 1.99.
**Significant at 1% level. t required for 75 d.f. is 2.64.
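The critical values quoted under Tables 3 and 4 can be cross-checked against one another. With 1 degree of freedom in the numerator, F is the square of t, so the F thresholds should lie close to the squared t thresholds. The 75 error degrees of freedom are also what one would expect from 80 subjects in 4 groups with one covariate. The single covariate (the initial test score) is an assumption made here, and the function names are hypothetical; this is a rough sketch, not the author's computation.

```python
# Critical values as quoted in the footnotes to Tables 3 and 4.
T_CRIT = {"5%": 1.99, "1%": 2.64}   # t required for 75 d.f.
F_CRIT = {"5%": 3.97, "1%": 6.99}   # F needed for 1 and 75 d.f.


def error_df(n_subjects, n_groups, n_covariates=1):
    """Error d.f. for a one-way analysis of covariance.

    The single covariate is an assumption, not stated in the text.
    """
    return n_subjects - n_groups - n_covariates


def significance_level(f_value):
    """Return '1%', '5%', or None for an F with 1 and 75 d.f."""
    if f_value >= F_CRIT["1%"]:
        return "1%"
    if f_value >= F_CRIT["5%"]:
        return "5%"
    return None


if __name__ == "__main__":
    print("error d.f.:", error_df(80, 4))
    for level in ("5%", "1%"):
        # Squared t threshold next to the quoted F threshold.
        print(level, round(T_CRIT[level] ** 2, 2), F_CRIT[level])
    # Intropunitive Ego-Defensive row of Table 3.
    for f in (2.48, 7.05, 5.24):
        print(f, significance_level(f))
```

The squared t thresholds agree with the quoted F thresholds to within rounding, which suggests the two tables rest on the same error term.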


However, the unanticipated differences be- 
tween good and poor students, as revealed in 
Table 3, are perhaps of just as much interest. 
The F’s in the table, taken together with the 
adjusted means of Table 2, show that regard- 
less of grade reported the poor students gave 
fewer Intropunitive Need-Persistent responses, 
fewer total Intropunitive, more Extrapunitive 
and fewer Need-Persistent responses. Leaving 
aside the low standing of Group 1 in the In- 
tropunitive Ego-Defensive category, already 
noted above, the differences in total Intropuni- 
tive responses seem to be traceable primarily 
to the Intropunitive Need-Persistent category, 
as do likewise the differences in total Need- 
Persistent responses. Thus the findings for the 
last four categories listed in Table 3 appear to 
add up to fewer Intropunitive Need-Persis- 
tent responses and more Extrapunitive respons- 
es on the part of poor students in this situa- 
tion. Intropunitive Need-Persistent responses, 
it may be recalled, are those in which the sub- 
ject offers amends, or takes it upon himself to 
solve the problem, while Extrapunitive re- 
sponses involve the direction of aggression to 
the external environment. It should be added 
that the above difference in the Extrapunitive 
category must be regarded as of borderline 
significance in view of the results of the “Be-
tween Groups” test reported in Table 2.


DISCUSSION 


Consider first the finding of differences be- 
tween good and poor students. Judging from 
subjects’ comments, the good students were 
probably more frustrated on the whole than 
the poor ones. It seems unlikely, however, that 
this mere quantitative difference is responsible, 
because the difference in frustration level be- 
tween Groups 1 and 2 was almost certainly 
greater than that between good and poor stu- 
dents, and yet in the former case no difference 
was observed in these same scoring categories. 
Evidently some factor relating more directly 
to the distinction between “goodness” and 
“poorness” must be involved. It will be re- 
called that “goodness” and “poorness” in this 
case were defined with reference to grades 
earned on a single course examination. No at- 
tempt has been made to check the academic 
records of the subjects, but it seems probable 
that the examination grade would correlate 
with the subjects’ previous performance, and 
presumably also, therefore, with certain fac- 
tors of personality. Thus the observed differ- 
ences in response may reflect differences in 
characteristic reactions in a frustrating situa- 
tion. It would not be too surprising to find a 
somewhat greater tendency toward Intropuni- 
tive and Need-Persistent responses among good 
students. However, it should be noted that 
there were no significant differences between 
the two groups on the initial test; evidently 
it took the stress attendant on the return of an 
examination to bring out these differences in 
response tendency. If this interpretation is cor- 
rect, it might suggest that the full diagnostic 
value of the test is not realized under the pos- 
sibly less stressful conditions of ordinary test- 
ing. On the other hand, it is possible that the 
observed differences simply reflect differences 
in expectations with reference to this particu- 
lar examination rather than more general 
characteristics of personality. 

In any case, the finding of changes of the 
above sort, appearing as they do to be amen-
able to reasonable explanation, would seem to
add to the presumption of test validity. The re-
duction in Intropunitive Ego-Defensive re-
sponses in Group 1 may be considered in a sim-
ilar light. All of these differences are, of
course, small. Certain features of the experi- 
ment may be responsible for this in part. Use 
of the test in the group situation may have 
permitted subjects an undue amount of delib- 
eration, and possibly reduced the reliability of 
scoring. Memory may have played a part also. 
Many responses on the final test closely re- 
sembled those on the initial one, and many 
subjects felt that memory had influenced their 
answers. However, the results of Franklin and 
Brozek [1], who found no differences in mean 
performance on P-F tests administered indi- 
vidually after six months of semi-starvation 
and again after three months of rehabilitation, 
suggest that the influence of the above factors 
may not be important. There remains in both 
the Franklin and Brozek study and the pres- 
ent experiment the possibility that group com- 
parisons may not adequately reveal frustration 
effects if there are extensive individual differ- 
ences in characteristic frustration reactions. 
And finally there is, of course, the possibility 
that the test itself is not adequately sensitive to 
such effects. 


SUMMARY 


The Rosenzweig Picture-Frustration study 
was administered to 80 students three weeks 
before and immediately after the return of a 
course examination. In the case of half of the 
40 “good” students (those earning A or B in
the examination) the examination grade re- 
ported was reduced by two letter grades, while 
for half of the 40 “poor” students (those 
earning C or D) the reported grade was raised 
two letter grades. Application of analysis of 
covariance to the results in each of the 15 
Rosenzweig scoring categories revealed that: 
(1) good students given low grades did not 
differ significantly in any response category 
from good students given correct grades; (2)
as compared with poor students given high 
grades, the poor students given their correct 
grades showed significantly fewer Intropuni- 
tive Ego-Defensive responses; (3) good stu- 
dents differed from poor students as a group 
in showing more Intropunitive Need-Persis- 
tent and fewer Extrapunitive responses. The 
results are interpreted as lending support to the 
validity of the test, although the small size
of the changes produced is stressed as possibly
qualifying this interpretation. 


Received June 23, 1949.


A QUANTITATIVE COMPARISON OF PSYCHODIAGNOSTIC
FORMULATIONS FROM THE TAT
AND THERAPEUTIC CONTACTS¹


CARL H. SAXE 
NEUROPSYCHIATRIC HOSPITAL, VETERANS ADMINISTRATION CENTER 
LOS ANGELES, CALIFORNIA 


In clinical situations, the Thematic Apper-
ception Test is often used to reveal the dy-
namics of an individual as an aid to subsequent
psychotherapy. From a practical standpoint, 
the value of the TAT decreases in proportion 
to the amount of time that must be spent in 
the interpretation of the protocols. A relatively 
quick survey is required, resulting in brief dy- 
namic formulations of the thema found in the 
separate stories, and in an integrated summary 
statement about the patient as revealed by the 
test as a whole. Such a procedure involves a 
marked departure from the method of TAT 
analysis advocated by Murray [4] in his man- 
ual. 

The purpose of the present experiment was 
to approach an assessment of the validity of the 
TAT by comparing diagnostic formulations 
obtained (a) from the TAT, and (b) from 
therapeutic contacts with the patient over a 
period of approximately four months. From 
an examination of 20 children with the TAT, 
a “questionnaire” of 83 items was constructed, 
containing all of the major thema produced 
by all of the children. For each individual 
child, the examiner indicated on the question- 
naire the thema obtained from that child’s 
TAT. Subsequently, a psychotherapist also in- 
dicated on the questionnaire the statements 
that were pertinent to an individual subject.²

¹The present study is part of a doctoral project
completed at Teachers College, Columbia Univer-
sity in June, 1947. The writer wishes to express his
indebtedness to Dr. Laurance F. Shaffer, who
served as chairman of his committee, to Dr. Ger-
trude P. Driscoll and to Dr. Percival M. Symonds,
who as members of the committee gave invaluable
assistance in the conception and completion of this
project.

²See Appendix for a sample protocol and analysis,
and for the questionnaire.


The examiner’s formulations from the TAT 
were regarded as the “test” to be validated; 
the psychotherapist’s marking of the question- 
naire was regarded as the validating criterion. 

In the experiment, no attempt was made 
to obtain quantified relationships from refined 
estimates of needs, presses, and thema. Each 
story was considered as a whole, from which 
individual thema were derived. These thema 
were regarded as statements of characteristics 
of the patient, which might be confirmed or 
denied by the psychotherapist on the basis of 
his contacts. Similar methods have been advo- 
cated and used by others [1, 2, 3, 5, 6, 7, 8] 
and seem to be a common practice in research 
in clinical psychology. 

A number of precautions and controls were 
used to meet the requirements of scientific 
method. The TAT protocols were given a 
“blind” analysis. The therapists were not per- 
mitted to know the results of the TAT prior 
to committing themselves in writing as to the 
personality descriptions of the subjects. The 
diagnostic formulations from any individual’s 
TAT were concealed by including them in the 
mass of formulations for the total number of 
subjects. 


EXPERIMENTAL PROCEDURE 


1. Twenty children, ranging in age from 
9 to 17 with a mean age of 12 years and 6 
months, an IQ range of 83-188 and a mean 
IQ of 112, and consisting of 15 boys and 5 
girls, were selected from the psychotherapeutic 
case load of the Bureau of Child Guidance, 
Board of Education, New York City. The 
psychotherapists were qualified psychiatrists.³

2. Criteria for selection of these individuals
were the interest of the psychotherapists in
participation in the experiment, the fact that 
psychotherapy with these children had just be- 
gun or was about to begin (psychotherapy 
might well alter TAT production in an un- 
predictable manner), and finally that only 
children over the age of 8 could be accepted 
since prior to this age the predominant ten- 
dency of most subjects is to enumerate details 
of the pictures rather than to tell stories. 

3. The standard twenty-one cards of the 1943
edition of the TAT were administered in a
single session. For elementary school students,
the children’s set was used; for high school
students, the adult set was used. The set
appropriate to the sex of the subject was used.
The following modification of the standardized
instructions for administration was used
throughout the experiment.


First 10 cards. This is a test of your imagination. 
I’m going to show you these (pointing to overturned
set) pictures and I want to see how good a story 
you can make up about each of them. You may 
consider these pictures as being an illustration for 
a magazine story or a book and I want you to make 
up what you think that story might have been. Tell 
me what the characters are thinking and feeling, 
what might have happened before this scene, what 
is happening now, and how it might turn out. Make 
up as good a story as you can. There are no right 
or wrong answers. Try this first picture. 


Second 10 cards. These are harder because they are 
a little less definite. See how good a story you can 
make up with these. 


Card 16 (Blank). This card is a little unusual. It 
is a blank. I want you to imagine a scene or pic- 
ture and then make up as good a story as you can 
about this scene or picture. 


When subjects failed to include the major 
elements in a story or left some of the details 
ambiguous, nonsuggestive questions were asked 
freely. Sample questions were: “What hap- 
pened then?”, “What made him think that 
way?” or, “How does it end?” 

All stories and questions were recorded ver-
batim. The time required for administration
varied from 40 minutes to 2 hours, the varia-
bility being due to length of the stories and the
speed with which they were told.

³A particular debt of gratitude is due Dr. Wil-
liam G. Beckman, Dr. Virginia Moore and Dr. Hen-
ry Wadsworth for their wholehearted cooperation in
the psychotherapeutic phases of this experiment and
to Dr. Morris Krugman for his kind cooperation in
all phases of administrative and scientific work.

4. The TAT protocols were analyzed as 
described above. The themas from each story 
were listed separately and an overall interpre-
tation of the twenty stories given by each sub-
ject was made.⁴

5. Well over 400 themas were found in the 
group of twenty protocols. Many themas were 
essentially the same and many overlapped. The 
themas were put in the form of statements 
about a subject and those from different sub- 
jects which overlapped significantly or were 
duplicated were eliminated. The remaining 
statements, taken together, constituted a ques- 
tionnaire. 

Some sacrifices of themas which might pos-
sibly have been crucial were made in the in-
terest of brevity of the questionnaire.⁵ Brevity
was important to maintain the high level of 
cooperation of the psychotherapists in answer- 
ing the questionnaire. 


It should be noted that the questionnaire 
as it was finally set up included themas from 
all twenty subjects. This procedure was fol- 
lowed to help insure that undue bias might not 
result from the psychotherapist receiving a 
questionnaire based on the TAT findings of 
the experimenter on a single subject. This fac- 
tor also made it necessary to include a “no in- 
formation” (ni) category in the questionnaire 
since many of the statements would not apply 
to a single subject. Another reason for inclu- 
sion of the “no information” category was to 
take into account the possibility that TAT an- 
alyses and information derived from psycho- 
therapy might not coincide. 


6. The questionnaire was then answered 
by the experimenter for each subject on the 
basis of TAT data with the exception that 
name, age, sex, school grade, IQ and family 
composition were available to the experimenter. 

7. After a period of approximately four
months of continuous psychotherapy in a given
case, the questionnaire and instructions for
answering the questionnaire were given to the
three psychiatrists. Despite precautions in the
preparation of the wording of questionnaire
items and instructions (which involved checks
by a number of qualified consultants), ambig-
uities and misunderstandings of unknown mag-
nitude were revealed in the final processing of
the questionnaires.

⁴See Appendix I for sample protocol and analysis.
⁵See Appendix II for sample completed question-
naire.

The period of four months was regarded by 
the psychotherapists as adequate for obtaining 
the information necessary for completing the 
questionnaire. The psychiatrists had available
to them not only the information derived from 
psychotherapy but, in most instances, case his- 
tory material assembled by a social worker, 
reports from psychologists who gave a size- 
able battery of psychological tests including the 
Rorschach (but not the TAT) and supple- 
mentary information from parents and rela- 
tives with whom they frequently had contacts. 

The following are the instructions as given 
to the therapist and as followed by the experi- 
menter in answering the questionnaire: 


1. Since this experiment required “blind analy- 
sis” of the TAT protocols, there was severe limita- 
tion placed on the extent to which refinements or 
minute interpretations could be made. Consequently, 
only the pressing adjustment problems, those prob- 
lems with which the children are actively and in- 
sistently struggling, are to be checked on the ques- 
tionnaire. 

2. It would seem possible to answer a great many 
of these questions either affirmatively or negatively, 
rather than “no information” by resorting to our 
knowledge of personality growth and development 
and inferring the existence of attitudes and feelings 
from behavior evidences in therapeutic contacts. 
However, the questionnaire was not designed with
this in mind. The experiment calls for answering 
“Yes” or “No” to those questions in which positive 
evidence exists that these statements represent “core” 
adjustment problems or “core” personality traits. 
All other questions would be checked “no informa- 
tion”. 

3. There are many questions which have two or 
more component parts. Each question must be con- 
sidered as a unit, thus: “S is hostile toward mother 
because he feels rejected” would be answered “Yes” 
if he is both rejected and hostile and might be an- 
swered “No” or “no information” if he is rejected 
but does not feel hostile or if he is hostile but does 
not feel rejected. 

4. In these questions, “mother” and “father” may 
be understood to represent either mother or father 
or a mother-figure or a father-figure.

5. “S” stands for “subject,” or the particular
child to which the statement refers.

6. For statistical reasons, it is important that 
you mark every question in one of the three scoring 
categories. 


The responses to the questionnaire were
treated statistically by comparing the extent of
agreement between therapist and experimenter
on each of the items, where both committed
themselves to a “yes” or “no” response. While
this precluded the possibility of statistically
treating information which the therapist might
have on the patient and the experimenter might
not, or vice versa, an experimental design in
which it was desirable to conceal information
obtained about a single subject among the to-
tal number of cases suggested no alternative
method.


TABLE 1
NUMBER OF ITEMS OF AGREEMENT AND DISAGREEMENT
BETWEEN THERAPIST AND EXPERIMENTER
AND PROBABILITY OF AGREEMENT EXCEEDING
CHANCE EXPECTANCY

Case
Number    Agreement    Disagreement    Probability
   1          35             7            .01
   2          28            21            .3
   3          38            17            .01
   4          22            13            [illegible]
   5          19            11            .3
   6          22            23            [illegible]
   7          33            10            .01
   8          50             9            .01
   9          24            27            -.7
  10          24            27            -.7
  11          27            32            -.5
  12          38            22            .02
  13          31            19            .1
  14          36            18            .01
  15          30            17            .01
  16          38            15            .01
  17          37            19            .01
  18          39            18            .01
  19          18            13            .4
  20          20            17            .6
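The counting rule described above, under which an item enters the comparison only when both raters committed themselves to “yes” or “no”, can be sketched as follows. The response labels, the function name and the sample data are hypothetical and serve only to illustrate the rule; they are not taken from the questionnaire itself.

```python
from collections import Counter

# Responses that count as a commitment; "ni" (no information) does not.
COMMITTED = {"yes", "no"}


def tally_agreement(experimenter, therapist):
    """Count agreements and disagreements on jointly committed items.

    Returns (agreements, disagreements, excluded), where "excluded"
    counts items on which at least one rater marked "ni".
    """
    if len(experimenter) != len(therapist):
        raise ValueError("both raters must mark every item")
    counts = Counter()
    for e, t in zip(experimenter, therapist):
        if e in COMMITTED and t in COMMITTED:
            counts["agree" if e == t else "disagree"] += 1
        else:
            counts["excluded"] += 1
    return counts["agree"], counts["disagree"], counts["excluded"]


if __name__ == "__main__":
    # Five made-up items for one hypothetical case.
    exp_marks = ["yes", "no", "ni", "yes", "no"]
    ther_marks = ["yes", "yes", "yes", "ni", "no"]
    print(tally_agreement(exp_marks, ther_marks))
```

Under this rule the agreement and disagreement counts for a case need not sum to 83, which is consistent with the varying row totals in Table 1.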


RESULTS 


Table 1 itemizes by case number the raw 
number of agreements and disagreements be- 
tween therapist and experimenter and the 
probability of agreement exceeding chance
expectancy for each case. The significance of
this agreement for the total population was
calculated by the formula

$$\sigma_M = \frac{S}{\sqrt{N-1}}$$

which by use of Student’s distribution for 
small populations revealed that P = .0002. 
Thus for the total population agreement be- 
tween therapists and clinician could not have 
occurred by chance more than twice in ten 
thousand times. The significance of agreement 
for each separate case was calculated by the
formula

$$\frac{p - .50}{\sqrt{pq/N}}$$

and, again referring to Student’s distribution,
gave the probabilities tabulated in Table 1.
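As an illustration of the per-case computation, the statistic can be read as (p − .50)/√(pq/N), with p taken as the proportion of agreements among the items on which both raters committed themselves, q = 1 − p, and N the number of such items. These definitions are an assumption made for this sketch, since the text does not spell them out, and the function name is hypothetical. The sketch stops at the ratio itself; the probability would still have to be read from a table of Student’s distribution.

```python
import math


def critical_ratio(agreements, disagreements):
    """Compute (p - .50) / sqrt(p*q/N) for one case.

    p is the proportion of agreements among jointly committed items,
    q = 1 - p, and N = agreements + disagreements (assumed reading).
    """
    n = agreements + disagreements
    if n == 0:
        raise ValueError("no jointly committed items")
    p = agreements / n
    q = 1.0 - p
    if p * q == 0.0:
        raise ValueError("ratio undefined when p is 0 or 1")
    return (p - 0.50) / math.sqrt(p * q / n)


if __name__ == "__main__":
    # Agreement and disagreement counts for three cases from Table 1.
    table_1_counts = {1: (35, 7), 9: (24, 27), 12: (38, 22)}
    for case, (agree, disagree) in table_1_counts.items():
        print(case, round(critical_ratio(agree, disagree), 2))
```

A negative ratio arises whenever disagreements outnumber agreements, which appears to be what the minus signs in the probability column of Table 1 indicate.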

It will be noted that significant agreement 
(below .05 level) was obtained in half of the 
cases (10); that agreement was positive but 
cannot be considered statistically significant in 
six cases and that disagreement exceeded agree- 
ment in the remaining four cases. Thus, from 
another aspect, the data support the probability 
that agreement between therapist and experi- 
menter significantly exceeds chance expectancy. 

That significant agreement is found is not- 
able despite the inherent difficulties in attempt- 
ing to quantify highly subjective data. Seman- 
tic problems for both therapist and experi- 
menter, as well as “blind” diagnosis by the ex- 
perimenter (an unusual and undesirable pro- 
cedure in an actual clinical situation), prob- 
ably made for greater divergence in diagnostic 
formulations than might actually have been 
the case if the therapist and experimenter had 
pooled information even to a limited extent. 
These problems might have been avoided if it 
had been possible for the experimenter and 
therapist to discuss the meaning of each ques- 
tionnaire item with reference to the particular 
patient. 

Following the completion of the question-
naires, an informal attempt was made to check
on the possibility of agreement being increased
by conferences between therapist and experi-
menter. In a discussion of two cases it was
found that the statistically revealed disagree-
ment was, in many instances, due to differences
in subjective judgment as to the weight to be
attached to the various aspects of personality
revealed, to differing interpretation of the
questions, and to aspects of personality not
clearly revealed in therapeutic contacts but
which, when brought into focus by the sup-
plementary material from either TAT or
therapy, were recognized by both as offering
fruitful reorientation to aspects of the per-
sonality previously noted but given insufficient
weight due to the tenuousness of the insight.

A check was made on the extent of agree- 
ment between therapist and experimenter on 
each of the items of the questionnaire. If thera- 
peutic contacts were accepted as a valid criter- 
ion against which diagnostic formulations de- 
rived from the TAT may be measured, this 
analysis would reveal in what diagnostic areas 
the TAT could be most and least effective. 


The formula

$$\frac{p - .50}{\sqrt{pq/N}}$$

was again utilized, applying it to each of the
83 items for the N of 20. The results of this
statistical analysis are given in Table 2.


The items of the questionnaire are divided 
into twelve diagnostic areas, which are ap- 
proximations of the trend of the major themas 
revealed by TAT analysis. The number of 
items in each of the areas varied not with the 
diagnostic importance of the area but with the 
diversity of the thematic material which was 
revealed by the protocols. 


Examination of Table 2 reveals that there
are 21 items in which agreement assumes sta-
tistical significance, one item in which disagree-
ment is statistically significant, and a relative-
ly large number of items which fall into the
“no-man’s-land” of neither significant agree-
ments nor disagreements. In terms of diagnos-
tic areas, some suffer more from this fate than
do others. Area I, that of relationships with
parents, area VIII, relating to self-evaluations
of the subjects, and area IX, relating to reac-
tions to anxiety, are more heavily weighted on
the side of agreement. On the other hand dis-
agreements appear to predominate in area II,
relationships with the mother; area VI, sexual
activities and conflicts; area VII, attitudes
toward work and achievement; and area X,
ability to form affectional relationships.


TABLE 2
THE LEVEL OF SIGNIFICANCE OF QUESTIONNAIRE ITEMS BY DIAGNOSTIC AREA AND ITEM NUMBER

Diagnostic Area | Perfect Agreement | Agreement at 5% Level or Better | Agreement Not Significant | Agreement Equals Disagreement | Disagreement Not Significant | Disagreement at 5% Level or Better

I. Relationship with Parents                      1,9,10    2,3,6,7,8    4,5
II. Relationship with Mother                      19    17    12,14,20,21    15,16    11,13,18
III. Relationship with Father                     28    25,27    22,23,26    24
IV. Relationship [illegible]                      29,30    31
V. Socialization                                  32    33,34,35,36
VI. Sexual Activity                               40,42    37,39    41,43,44
VII. Attitudes toward Achievement                 48    45,46,47
VIII. Self-Evaluations                            50    51,54,55    49,53    56    52
IX. Reaction to Anxiety                           58,59,61    57,62,63
                                                  60,68    65,70    66,67,69    64
X. Ability to Form Affectional Relationships      71,75    72    74    73
XI. Attitudes toward Growing-up                   76,77,79    78
XII. Reaction to Therapy                          81    80,82,83
Total                                             [illegible]    16    37    8    15    1


SUMMARY 


The study indicates, then, that for these
twenty individuals who are known to have 
emotional and behavior problems (by reason 
of the fact that they have been accepted for 
psychotherapy), the TAT offers general di- 
agnostic clues similar to those gained from 
therapeutic contacts. The evidence is not over- 
whelmingly strong, even though the signifi- 
cance of agreement between the two methods 
of diagnosis as a whole is relatively high. 

Several hypotheses might be offered which,
if followed up in further studies, might over-
come the shortcomings of the present study:

1. The method of “blind analysis” is not
justified with the TAT. Diagnosis achieved
through this instrument may be too dependent
on the corrective influence of the known facts
of the subject’s environment.

2. Fantasy material such as is obtained in
TAT protocols may, while being a valid de-
scription of deeper personality levels, not be
comparable to the material obtained in four
months of therapeutic contacts. Thus, both
therapist and experimenter may be equally
correct at different levels of personality de-
scription.

3. Semantic misunderstandings were cer-
tainly involved in the use of the questionnaire.
Another method, or refinement of this method
of quantitative comparison, may yield better
agreement.

4. The area of investigation, personality
as a whole, when statistically broken down in-
to separate questionnaire items, falls subject to
distortions of interpretation and representa-
tion.

5. The method of “thematic analysis,”
though it saves time and is expedient in the
clinical situation, may not be the method which
provides the most relevant data for diagnosis
in psychotherapy.


APPENDIX 


I. A TAT protocol and Thematic Analysis. 


II. The questionnaire (scored for this case by therapist and experimenter).


I. A TAT PROTOCOL

Selection of the Protocol. Case no. 16 was selected as being fairly representative of the general run of
protocols. This is a boy who is twelve years old, is in the sixth grade, tests on the Stanford-Binet at IQ 97,
lives with his mother, a step-father, four brothers ages 11, 13, 14, and 17, two sisters ages 8 and 16 and
one step-sister age 5. Nothing further is known to the experimenter about this case. The TAT series for
boys was used.


THE PROTOCOL 
Picture no. 1 
This little boy is thinking what he can play and in 
the picture he’s studying music so he can give a 
concert. Right now he’s studying the violin and 
when he has it all figured out then he’s going to 
play. 


Picture no. 2 

Dragonseed—the woman is watching her brother 
plow cornfields. You can see it’s a hot day because 
he has his shirt off and she’s wondering, saying to 
herself what a nice food we are going to have for 
the winter. The older sister is wondering where she
should go for her education — wondering what 
trade she should take up. She feels sad because she 
doesn’t know. (How does it end?)⁶ She hopes, the
one leaning against the tree, that he makes it all the 
way through the field. (What happens with the 
other one?) Finally she gets a good job, comes here 
to visit her sister to see how she is making out. 


Picture no. 3 

This is a boy that’s sleepy. He’s so tired that he 
didn’t have strength to get into bed. He falls asleep 
on the floor he’s so tired. He didn’t have strength to 
get on the couch. (Why was he so tired?) He was 
downtown making money, like I do, only he was 
more tired. 


Picture no. 4 

This is a soldier that’s going away overseas and 
that must be his wife telling him ‘take care of your- 
self’ and all that. While he’s fighting overseas, the 
Japs come and they raid him and he saved his 
buddies. While on Guadalcanal, he landed on a 
mine just so he can save his buddies. He sent his 
wife a letter and all her life she’s trying to forget 
what happened and she couldn’t forget and she 
must have doublecrossed him—she looks like she has 
evil eyes, like she’s after money instead of love. 


THEMATIC ANALYSIS 


Hero works methodically in order to achieve. No 
negative or positive affect in relation to work. 


Story is centered on sibling relationships. They all 
suffer economic hardship but are supportive of one 
another. Hero is successful in quest for economic se- 
curity. 


Subject identifies strongly with hero who is exhaust- 
ed from over work. There is acceptance of construc- 
tive work activity but there is a feeling of hardship. 


Evaluation of love relationship with a woman 
switches during the telling of the story. At first there 
seems confidence in the existence of a real affection, 
later this turns to hostility and suspicion as the 
woman gives evidence of betraying him. 
Hero achieves glory through self-sacrifice. 


⁶Questions by the experimenter are given in parentheses.


(How does it end?) So she was blackmailing him 
and he didn’t know it, when he died his ghost came 
to haunt her. She couldn’t take it so she went to the
police. She told the police she was bribing him 
and they put her away in jail — they gave her a 
sentence. So, everybody forgot about it. 

It might be another story. He wasn’t going over- 
seas, he just came back. While he was over, she 
was making love to somebody else and she just got 
his money. When he came back he had heart 
trouble and soon he died and came back and haunted 
her. Soon she gave herself up to the police and they 
gave her a sentence of ten to twenty years. 


Picture no. 5 

That’s a lady that’s walking in the door. She was 
going to read something but she sees somebody 
hiding. She got so scared she lost her voice. She 
stands there paralyzed. She calls the police. Before 
that she ordered some groceries. Nobody answered 
the front door so he went inside the back door and 
he saw her paralyzed. He tried to talk to her and 
found out that she died of fright and he told the 
police. The cops came and took pictures of her, the 
way she died. 


Picture no. 6 

This is a stormy night and the bridge was washed 
out by the rain. The mother is wondering how her 
son will get home and he’s wondering how he will 
get home. After a couple of days of the hurricane 
and sleet, the repairmen come and they fixed the 
bridge, made it more stronger — had to take more 
time. The son came home and the oldest son that 
was married came home too. 

He’s wondering if his brother-in-law had died in 
the storm. In fact on his way home he met him and 
they were all happy and he went home by himself. 


Picture no. 7 

Over here they are arguing. He’s arguing with his 
rich uncle because he said he was going to change 
his will. The uncle said, ‘you are waiting for me to 
die so you can have my money. I’m not going to 
give you nothing.’ The nephew was so mad, he 
came home one day and shot him so he could get 
the money before he changed it. A couple of months 
later a detective was still working on the case. He 
found a clue and questioned him on the sly. He 
asked him why he killed his uncle — to get money? 
So, he pulled out a gun and tried to shoot them 
(the detectives), but one of the detectives waited 
outside. He saw him pulling out a gun and shot 
him in the wrist. They booked him on first degree 
murder charge and he was sentenced to ten to 
twenty years imprisonment and so ends the life of a 
greedy nephew. 


THEMATIC ANALYSIS 


Aggression against woman (mother-figure) — wo- 
man dies, hero is instrumental but not really re- 
sponsible for her death. 


Rather confused story, possibly due to repression of 
affect. Fear of natural elements present, conquest of 
natural elements through patience. Further empha- 
sis on family unity and harmony. 


Hero kills uncle (father-figure) because of uncle’s 
refusal to provide economic security. Hero is caught 
and punished. There is mild disapproval of hero’s 
action.


THE PROTOCOL


Picture no. 8 

This is a son that’s worrying about his uncle — no, 
about his older brother, whether he is going to die 
or not because somebody shot a bullet into his side 
next to his appendicitis. The doctor was there try- 
ing to get the bullet out, with his assistant. That’s 
all. (pause) They took the bullet out but it didn’t 
do no good because he didn’t do nothing about the 
appendicitis before and it got ruptured so, he died 
from the bullet wound and appendicitis. (Who 
shot him?) He was climbing over somebody’s 
property trying to get some apples and the farmer 
shot him. They didn’t put no charge on the farmer 
but they had a trial and he was held on bail. 


Picture no. 9 

These are a bunch of kids laying out on the sunny 
grass after a long hike — Boy scouts (contemptuously).
They came from a seven mile hike to a — they
were just a couple of Boy Scouts on this long hike. 
They were real tired. (Did anything exciting hap- 
pen?) No, it was a dull hike. 


Picture no. 10 

This is out in the garden. They are worrying about 
something. It’s about money. They ain’t got no 
money to pay for the apartment they got. They are 
very poor. The husband is trying to comfort the 
wife even though he is worrying about it too. He 
can take care of it himself. That’s all. (How does 
it end?) They couldn’t raise enough money and 
they are evicted and they have to separate — that’s 
all. They have to find another house to live in — an 
old shack where nobody ever goes. Where they were 
living at a long time ago, there used to be a robber 
who lived there. He used to rob banks. They found 
under the cupboard a board and under the board 
they found a whole stack of hundred dollar bills.
There was five thousand dollars reward. They got 
it and bought a house and food and clothes and he 
got a job. Soon the $5000 was gone but he had a job 
and supported his wife and kids. 


Picture no. 11 

This is out in the deep forest where there’s a water- 
fall and a long time ago cavemen used to live there 
— a couple of hundred years before Christ was 
born. These cavemen used to go hunting for wild 
game. They took a photograph and this is how it 
looks, now. 


Picture no. 12 

In this town people didn’t go robbing and there was 
no crime. They could leave their rowboat out in the 
open and on a nice day they could go out rowboating 
and fishing. People could be trusted then. 


THEMATIC ANALYSIS 


Hostility against uncle, later older brother (father- 
figures) expressed. Hero feels guilty but evades this 
feeling by pointing out that father-figure was really 
the antisocial one and moreover failed to take care 
of himself. 


Speaks contemptuously of Boy Scouts, rejecting
socialized play. Need for passivity after effort.


Hero struggles against economic hardship. Wants 
to take responsibility for his family. Envisions a 
stroke of fortune which enables him to help family. 
There is an underlying feeling of inadequacy on the 
part of the hero. 


No emotional involvement in the story perceptible. 


Nowadays, people cannot be trusted.


Picture no. 13

This is in — way out in Oklahoma. This here young 
boy was looking at the cowhands when they were 
feeding. The boy is watching his father feeding the 
pigs and taking care of the horses and cows. He’s 
thinking one of these days, I'll be doing just what 
my father is doing now. He became one of the best 
cowboys in the country. He had a horse named 
Trigger. His name is Roy Roger. 


Picture no. 14 

This is a boy in the night. He’s looking out at the 
full moon — how beautiful it is, and how it shines 
on houses and streets. It’s around six AM, it’s a
little dark. He watches how the moon goes away
and how the sun comes up. He thinks it’s all
picturesque. (What happens then?) M-m-m-m.


Picture no. 15 

This is an undertaker. He’s standing beside—He’s 
an evil man and he’s standing beside the grave- 
yards, planning to steal some bodies and then go 
with his friends. He waits to see if somebody rich 
is coming. He’s with a lady and he says ‘you killed
him—killed him!’ She says, ‘Oh, no—I’ll give you
anything, my reputation!’ They say, ‘Give me ten
thousand dollars and I’ll forget what I saw and get
rid of the body.’ He kept on doing this and made a
million dollars, but he was still greedy. ‘I’m going 
to take one more body and then retire.’ The police 
set a trap and took infrared pictures when he was 
taking the body out and caught him and sent him 
to prison for life. Crime does not pay. 


Picture no. 16 (blank card) 

These is a bunch of boys fooling around and want- 
ing to get into the subway. No, they are all to- 
gether, just playing trying to get into the subway. 
Climb in different ways. They play hookey in the 
subways. Then they jump at the poles, swing all 
over, make a racket. That’s all. And so one of the 
boys gets killed and they decide they shouldn’t do 
it any more because it’s too dangerous. They were 
sent to reform school for three years. When they 
came out they decided not to do anything bad any 
more. Right now they are in high school studying 
a trade. 


Picture no. 17 

This is a boy —. Yankee Stadium is across the
street. He climbed up a rope to see the World
Series games and he enjoyed the game so much, he
forgot where he was and started clapping. He started
falling down. It was his life or burned hands. He
grabbed the rope. He said, ‘I’d rather pay a dollar
than be killed. I hope this teaches the other kids.
I’ve learned my lesson.’


THEMATIC ANALYSIS


Hero identifies with father as he engages in con- 
structive work activity. Hero achieves fame through 
work. Identification with father is in contrast with 
previous hostility to father-figures. 


Hero enjoys aesthetic experience. 


Hero is anti-social (a grave robber and blackmailer).
He is caught and retribution takes place.
Story ends with a moral platitude which is taken as
a rejection of morality.


Hero identifies with delinquents and engages in de- 
linquent activity. He recognizes social consequences 
and punishment makes him conform. 


Hero engages in delinquent activity, is physically 
hurt and decides to conform. He moralizes.


THE PROTOCOL


Picture no. 18 

This man was coming out of a rich bar and there 
was a crook coming out of an alleyway. He snatched 
him in and robbed him. Now he’s in the hospital 
suffering from a fractured brain, et cetera. They 
never caught the robber but they’ll catch him some 
day. 


Picture no. 19 

This is supposed to be a spookey house. The trees 
come out like that all shaped in different places. The 
windows come out like eyes. There’s a portrait of 
two men, one man in each window and they are 
looking out at the cold. They look like they are 
frozen stiff. There’s no chimney. That’s all. (Make 
up a story!) This is the coldest day of the year and 
the snow is falling and there are people in there 
frozen stiff. They are wondering when summer 
comes and when summer comes they worry about 
the winter. They always have worries on their 
minds. They can’t go out and get the wood. All
they can do is wait — hours seem like years. Finally,
the time comes. They choose a new house, next to 
a store and it has a back entrance which looks from 
the inside in case of burglaries and then they get 
food that way. 


Picture no. 20 

This is a foggy day. That’s a hobo. This man has 
no place to sleep. They usually sleep on park 
benches but the park was closed and guarded by a 
policeman. He got another park by this one he did- 
n’t like it. Here he’s standing by a lamppost. It’s a 
cold night and he’s wondering what to do. He got no 
cigarettes and he’s hungry. A neighbor knows he’s 
just a bum and can’t get along. He doesn’t want to 
help him all the time but he invites him upstairs and 
gives him a dinner and some old clothes and lets 
him go his way. (What happens then?) He didn’t 
hear from him no more.


THEMATIC ANALYSIS


Hero successfully escapes punishment for delinquent 
activity. He makes token gesture that crime does 
not pay. 


Element of terror in fantasy. Family experiences 
enormous economic deprivation and physical hard- 
ship. Somehow, in unexplained manner, they achieve 
security, both economic and against theft. 


Hero down and out, economically deprived. Police 
are antagonistic, not sources of help. He is befriended 
by a neighbor. 


Summary. Emotional ties to family are strong but there is unrelenting hostility to the father. He feels he
cannot rely on the love of women but cannot quite bring himself to actively reject them. The feelings of
hostility toward the father cannot be completely accepted by him. On the other hand he does little to overcome
these feelings. He fantasies himself in the father role, particularly as a better provider than his father.
Underneath this fantasy there is anxiety about his ability to maintain this position. He feels a real responsibility
for his family’s welfare but does not feel quite adequate. At times he envisions a stroke of luck
which enables him to aid them.


Delinquent tendencies are extremely active. He pays lip service to socialization as there develops a super- 
ficial realization of the existence of social punishment. 


II. THE QUESTIONNAIRE
The questionnaire reproduced below has been answered
by both therapist and experimenter for case
no. 16. The following code has been used to indicate
the answers of therapist and experimenter: E
means “experimenter”; T means “therapist”; +
means “yes”; — means “no”; and ni means “no information”.
Thus, (E +, T ni) means experimenter
“yes” and therapist “no information”.
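
The answer code just described can be read mechanically. The following is only an illustrative sketch, not part of the original study: the function name `parse_answers` and the regular expression are assumptions introduced here, and the long dash printed for "no" is normalized to a plain hyphen before matching.

```python
import re

# Meaning of each symbol, as defined in the paragraph above.
MEANING = {"+": "yes", "-": "no", "ni": "no information"}

# Hypothetical pattern for codes such as "(E +, T ni)"; spacing is allowed to vary.
PATTERN = re.compile(r"\(\s*E\s*(\+|-|ni)\s*,\s*T\s*(\+|-|ni)\s*\)")


def parse_answers(code):
    """Return (experimenter, therapist) answers for one questionnaire code."""
    normalized = code.replace("\u2014", "-")  # the printed "no" sign is a long dash
    match = PATTERN.search(normalized)
    if match is None:
        raise ValueError(f"unrecognized answer code: {code!r}")
    return MEANING[match.group(1)], MEANING[match.group(2)]


print(parse_answers("(E +, T ni)"))  # ('yes', 'no information')
```

A code that does not fit the pattern raises an error rather than being guessed at, which seems the safer choice for entries that are damaged or left blank.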
Subject: no. 16

1. S has been involved in active struggle for 
parental understanding. (E +, T+) 

2. S tends to live in social isolation since he is
fearful and suspicious of both parents and friends. 
(E —, T +) 

3. S reacts to rejection by parents with fear, 
hatred, and desire to revolt. (E ni, T +) 

4. S is unhappy in the home and desires to 
handle this by running away — is prevented from 
doing this by fear of the consequences. (E —, T +) 

5. S is emancipating self from a controlling 
family but is finding the progress difficult and 
counteracts with aggression against them. (E —, T 
+) 

6. S in conflict about whether or not to forego 
own ambitions and pleasures or sacrifice these for 
the family. (E +, T+) 

7. S lives completely within the realm of the
family situation — does not venture out of the situ- 
ation to establish relationships with friends. (E —, 
T—) 

8. S desires to be head of family — to act in 
father role. (E +, T —) 

9. Need to help family financially is used by S 
as a justification for delinquency. (E +, T —) 

10. S has crystallized death wishes for both par- 
ents. (E —, T ni) 

11. S is hostile to the mother. (E +, T ni)

12. Hostility toward the mother is unconsciously 
generalized to all women. (E +, T +) 

13. S is troubled with guilt feelings as a result 
of his inability to check his feelings of aggression 
toward the mother. (E +, T +) 

14. S has a strong need for love and protection 
from the mother. (E +, T +) 

15. S feels strongly positive toward his mother, 
but does not feel this feeling is reciprocated. (E —, 
T +) 

16. S needs the affection of the mother to the 
extent that he is willing to allow his other needs 
and goals to be subservient to hers. (E —, T —) 

17. S wants mother for himself but fears ag- 
gression from father — wants father’s sanction. 
(E ni, T +)

18. S wishes to gain the love of a rejecting 
mother through manipulation of situations in his 
favor. (E +, T +) 

19. Eroticized feelings toward the mother are 
present. (E ni, T ni) 

20. S is in competition with mother in intellectual
achievement. (E —, T —)

21. S reacts to being pushed into achievement
by the mother. (E —, T —)

22. S in active competition with the mother for
the affection of the father. (E —, T —)

23. S has a strong need for love and protection
by the father. (E —, T +)

24. S has a strong need for love and protection 
by the father but when he asserts his individuality, 
he feels in danger of being rejected by the father. 

25. S is hostile to his father. (E +, T +)

26. Hostility of S toward father has overt ex- 
pression. (E +, T +) 

27. S has crystallized death wishes for the father. 
(E +, T ni) 

28. S in conflict as to whether or not he has 
mother to himself — fears he shares her with father. 
(E ni, T +) 

29. S feels rejected by friends. (E +, T+) 

30. S keenly wishes acceptance by friends but re- 
alizes he is socially inadequate. (E —, T+) 

31. Relationship with friends of the opposite sex 
are contingent upon being in ascendent position. 
(E ni, T +) 

32. S has well-developed knowledge of right 
and wrong. (E +, T +) 

33. S accepts anti-social behavior. (E +, T +) 

34. S accepts anti-social behavior, but he has 
guilt about this behavior. (E +, T +) 

35. S has knowledge of right and wrong but 
would continue with anti-social behavior except for 
the fear of punishment. (E +, T +) 

36. S has need for social status but accepts an- 
ti-sociality as a means for attaining this. (E+, T+) 

37. S is able to establish adequate heterosexual 
relations. (E +, T +) 

38. S has strong tendencies toward homosexual 
relationships. (E ni, T +) 

39. S is preoccupied with sexual thoughts. (E ni,
T ni)

40. S reacts to sexual thoughts with guilt feel- 
ings. (E ni, T +) 

41. S fears women. (E +, T +) 

42. S is generally hostile toward persons of the 
opposite sex. (E —, T +) 

43. The tendency of S to show hostility toward 
women provokes guilt and anxiety. (E +, T +) 

44. S is able to relate on an affectual basis to 
boys although there is active hostility toward the 
father. (E +, T +) 

45. S accepts the need for constructive work ac- 
tivity. (E +, T +) 

46. S has strong need for achievement (ambi- 
tious). There is acceptance of the concept that work 
is necessary for achievement. (E +, T +) 

47. S feels oppressed by financial insecurity. 
(E +, T +) 

48. S has strong need for achievement, however 
he fantasies rather than works for success. (E +, 
T +) 

49. S values or seeks aesthetic expression. (E ni,
T +)

50. S is interested in aesthetic experience and
would like to withdraw into this. (E ni, T +) 

51. S identifies with intellectual social deviates 
(bohemians). (E —, T —) 

52. S feels physically inferior. (E +, T —) 

53. S has generalized feelings of inadequacy and 
self-rejection. (E +, T +) 

54. S feels inadequate and emphasizes health 
and physical strength as a means of resolving these 
feelings. (E+-, T+) 

55. S feels unaccepted socially and worthless 
personally, he wants regression or rebirth. (E ni, 
T ni) 

56. S has a high self-estimate — feels that others 
do not regard him so highly. (E ni, T +) 

57. S reacts to anxiety with overt aggressions. 
(E +, T ni) 

58. S reacts to anxiety with the use of humor as 
a defense. (E ni, T +) 

59. S reacts to anxiety with terror-filled fantasies
and suicidal wishes. (E ni, T ni) 

60. S reacts to anxiety by retreating to regres- 
sive fantasies. (E ni, T —) 

61. S reacts to anxiety with appeal to religion
or personal philosophy. (E ni, T ni) 

62. S reacts to anxiety with successful suppres- 
sion. (E —, T ni) 

63. S reacts to anxiety with fears of death. 
(E ni, T —) 

64. S seeks actively to develop personal criteria 
for emotional, social and economic living. (E —, 
T +) 

65. S is consciously self-centered and seeks to
create advantages for himself by using superior
verbal abilities. (E ni, T +)

66. S tends to evade the problems of reality by
fantasying success and self-realization. (E +, T +)

67. To S, money is the most obvious and meaningful
method of gaining social ascendancy. (E +,
T +)

68. S feels physically unattractive and minimizes
this fault in favor of the creation of beauty. (E ni,
T —)

69. S is constricted as a result of emotional conflicts
— is not free to move positively or negatively
in any direction. (E —, T —)

70. S idealizes non-aggression. (E —, T ni)

71. S is capable of forming affectional relationships.
(E +, T +)

72. S is aggressive and retaliative because of inability
to form affectional relationships. (E —, T —)

73. S attempts to hold onto affectional relationships
by submission and resignation. (E —, T +)

74. Present inability of S to form good affectional
relationships is due to inconsistent parental attitudes
toward him. (E ni, T +)

75. Present inability of S to form good affectional
relationships is due to accumulated frustration
and disappointment arising from previous efforts.
(E ni, T +)

76. The attitude of S toward growing up is one 
of recognition and acceptance of the responsibilities 
involved. (E +, T +) 

77. The attitude of S toward growing up is one 
of avoidance of the problems involved. (E —, T —) 

78. S recognizes that maturity involves the ac- 
ceptance of constructive work activity and strives 
for this. (E +, T +) 

79. S is fearful of new situations, reacts to them
with increased passivity. (E ni, T +) 

80. S developed transference very early in therapy.
(E —, T ni)

81. Anti-social drives are held in check by S by 
reason of transference to the therapist. (E ni, T +) 

82. S reacts to therapy as a threat because of the
anxiety and guilt aroused by revealing his real feel- 
ings; consequently, he avoids any situation which 
might involve him emotionally. (E —, T ni) 

83. S has long-standing distrust and resentment 
which has taken the form of anti-sociality. There 
is little anxiety present and consequently therapy 
has been made difficult. (E +, T —) 


Received June 20, 1949.


CHANGES IN THE GALVANIC SKIN RESPONSE
ACCOMPANYING THE RORSCHACH TEST¹


JEANNE R. LEVY 


UNIVERSITY OF PENNSYLVANIA 


WHILE the validity of the Rorschach
test has been established clinically
[10], little has been done to evaluate the technique
in terms of extraclinical criteria. That
“the test gives information as to the affective
status of the subject” [20, p. 97], particularly
through the colored and the heavily shaded 
cards, has been inferred by Rorschach workers 
on the basis of the cumulative evidence of 
thousands of verbal Rorschach responses cor- 
related with clinical symptoms [3, 12, 20]. 
In view of the widespread use which the Ror- 
schach currently enjoys as an aid to personality 
evaluation, it seems desirable to find some experimental
verification of the so-called “affective”
or “emotional” value of the cards.
Duffy [6] has pointed out that the most
characteristic feature of the condition called
emotion appears to be a change in energy level.
A peripheral index of such a change in energy
mobilization is to be found in the change in
palmar skin conductance commonly called the
galvanic skin response, or GSR [5, 7]. Palmar
skin conductance has been found by a number
of investigators to increase in situations designed
to produce “emotional tone” or to possess
“affective value” [1, 8, 9, 11, 17, 18].


¹This article is based upon the writer’s Master’s
thesis of the same title, which is on file in the
University of Alabama Library. The writer wishes
to express her appreciation to Dr. Paul S. Siegel
and Dr. Oliver L. Lacey for their indispensable
guidance, to Dr. Francis W. Irwin of the University
of Pennsylvania for his helpful suggestions
concerning preparation of the manuscript for
publication, to the University of Alabama Research
Committee for making certain facilities
available for the execution of the experiment, and
to the Kappa Kappa Gamma Fraternity and the
Sigma Delta Tau Fraternity, who supported her in
part with fellowship grants during the academic
year in which the research was conducted. This
paper was read in condensed form at the 1948
meetings of the Southern Society for Philosophy
and Psychology.

The GSR has been measured during the 
word-association test, another commonly used 
diagnostic tool based upon the “emotional”
value of the stimulus to the subject. Smith 
[22] reported that the GSR detected and 
measured “affective tone elicited by word-as- 
sociation tests.” Jones and Wechsler [11] 
found “a high degree of dependability” of the 
GSR as an indicator of the affective values of 
words. Dysinger concluded that the magnitude 
of the GSR “. .. is roughly indicative of the 
affective response of (the subject) as elicited 
by stimulus words” [8, p. 29]. 

It should follow then that presentation of 
the Rorschach cards, believed to be “affectively
toned,” should be accompanied by an increase
in palmar skin conductance, and that the
increase should be greatest upon presentation 
of those cards believed to have the most “af- 
fective value,” the colored and the heavily 
shaded cards. Evidence for such an increase in 
conductance would serve to strengthen Ror- 
schach theory. 


Search of the literature has revealed the 
publication of results of two investigations of 
the relationship between the GSR and “color 
and shading shock.” Milner and Moreault 
[16] reported agreement between Rorschach 
and galvanometric indicators of the existence 
or nonexistence of shock for a given individual, 
but unfortunately their study was not de- 
scribed in sufficient detail to permit evaluation. 
Rockwell and his co-workers [19] have inves- 
tigated “color shock” and psychoneurotic re- 
actions. Their 10 psychoneurotic patients pro- 
duced fewer verbal responses and a lower mean 
GSR than did either 10 normal subjects or 10 
normals whose Rorschach protocols showed
clear-cut evidence of “color shock.”

The study reported here is not concerned 
solely with the concept of “color shock” as the 
above-mentioned studies have been, but seeks 
rather to substantiate in terms of a peripheral 
indicator any “affective value” which the Ror- 
schach cards may possess. The explicit pur- 
pose of the investigation is two-fold: (1) to 
investigate individual differences in emotional 
reaction to the Rorschach plates as reflected 
by increased palmar skin conductance, and (2) 
to compare the relative amount of change in
conductance accompanying each card to deter- 
mine if the cards differ significantly in “affec- 
tive value” as indicated by increased conduc- 
tance. 

In terms of Rorschach theory it would be 
predicted that the greatest increase in conduc- 
tance should accompany Card VIII, since it is 
the first completely chromatic card and since 
it is preceded by four completely achromatic 
cards. A considerable increase in conductance 
is also to be expected accompanying the fully 
colored Card IX, for, although the preceding 
card is also completely chromatic, the colors 
in Card IX “. . . are disagreeable and dishar- 
monious. They provide a special opportunity 
for persons disposed to react with excessive 
emotional outbursts” [4, p. 256]. A sharp in- 
crease in conductance is also to be expected ac- 
companying the heavily shaded Cards IV and 
VI. Fulfillment of these predictions would
help to solidify the position of those Rorschach
theorists who have insisted that emotional
stimulation is provided by the colored and the
heavily shaded cards.


EXPERIMENTAL DESIGN AND PROCEDURE 


We were confronted with the problem of 
designing an experiment which would dupli- 
cate as closely as possible the normal Rorschach 
test situation, but which would at the same 
time include maximum control of all paramet- 
ers known to effect a change in conductance. 
Use of the galvanometer and other apparatus 
might conceivably affect the spontaneity of the 
subject’s responses, while a situation as free as 
that which usually exists during administration 
of the Rorschach would undoubtedly produce 
spurious galvanometric results. 

General Design. An essential feature of the
Rorschach test, particularly in the interpretation
of color responses [10], is that the cards
are always presented in the same order [2]. 
The succession of achromatic and chromatic 
cards is so arranged as to “produce all the 
possible nuances of color reaction. . .” [12]. 
However, one of the most outstanding charac- 
teristics of GSR studies is adaptation, or dim- 
inution of the response upon repeated stimula- 
tion [5, 7]. In their word-association study 
Jones and Wechsler [11] found that the posi- 
tion of a word in a series effected a striking 
difference in results, due to adaptation. To 
counteract this effect and at the same time to 
preserve insofar as possible the conventional 
sequence of the cards, the order of presenta- 
tion was systematically rotated. Card I was 
presented first to Subject 1 followed in order 
by Cards II, III, et cetera. Card II was pre- 
sented first to Subject 2, followed in order by 
Cards III, IV, et cetera, with Card I following
Card X. This arrangement was followed
consecutively, so that Card X was presented 
first to Subject 10, followed in order by Cards 
I, II, et cetera. Fifty subjects were employed, 
so that each card was presented five times in 
each of the 10 positions. In this way each card 
enjoyed the same overall adaptation effect, 
while the normal order of presentation was in 
part preserved. There was no prior determina- 
tion as to which S would be presented which 
sequence. Essentially, this ensured randomiza- 
tion. 
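
The rotation scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `presentation_order` is introduced here, and subjects 11 through 50 are assumed to repeat the same ten-sequence cycle, which the text implies but does not spell out. Subject numbers stand for order of appearance, since sequences were not assigned to particular S's in advance.

```python
from collections import Counter

N_CARDS = 10
N_SUBJECTS = 50


def presentation_order(subject, n_cards=N_CARDS):
    """Card sequence (card numbers 1-10) for a 1-based subject number."""
    # Subject 1 starts at Card I, Subject 2 at Card II, and so on;
    # assumed: the cycle repeats from Subject 11 onward.
    start = (subject - 1) % n_cards
    return [(start + k) % n_cards + 1 for k in range(n_cards)]


orders = {s: presentation_order(s) for s in range(1, N_SUBJECTS + 1)}

# Count how often each card falls in each serial position.
counts = Counter(
    (card, position)
    for order in orders.values()
    for position, card in enumerate(order, start=1)
)

print(presentation_order(2))  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
print(set(counts.values()))   # {5}
```

Every one of the 100 card-position pairs occurs exactly five times, which is the balance the design relies on to spread the adaptation effect evenly over the cards.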

Subjects. There is some evidence to indicate 
characteristic sex differences in both Rorschach 
responses [12] and the GSR [9]. To intro- 
duce uniformity, only Rorschach-naive males 
were used. Recruited from the student body 
of the University of Alabama, they ranged in 
age from 17 to 38. In view of the fact that 
they were all university students, it is believed 
that they were roughly equated in intelligence. 
It was considered desirable to hold both age 
and intelligence fairly constant since there is 
some evidence for Rorschach 
patterns among age groups and among those of 
diverse IQ’s [8]. The number of S’s, 50, was
chosen as a multiple of 10. 

Apparatus. To eliminate possible effect on 
the GSR of muscular activity [24] involved 
in S’s handling of the card, the plates were
presented by means of an electrically driven
Guhin card changer (Stoelting). The card 
changer was modified so that exposure time 
could be controlled by E with a silent switch. 
The plates were mounted on standard card 
changer cards in such a manner as not to in- 
terfere with the area of the plate normally 
seen by S. Since sharp, unexpected noises have 
been found to increase conductance [1], a 
turkish towel was folded in the bottom of the 
drawer of the card changer to deaden the 
sound of the falling card. 


The apparatus used for measurement of the GSR
was the simple potentiometric circuit described by
Lacey and Siegel [14]. The electrodes were the
Darrow universal type and were filled with Sanborn
Redux paste. To assure uniformity of contact
surface for all S’s, the palms were wiped with
70 per cent alcohol before the electrodes were adjusted.
The unit of measurement used was change
in conductance, since this unit has been shown to
meet the two criteria of normality of distribution
and independence of resting level [13]. Change in
conductance was calculated in the manner explained
by Lacey and Siegel [14].


When S entered the experimental room, a white
cover card of the same size as the Rorschach card 
was visible on the card changer. S was seated in 
a comfortable chair directly in front of the card 
changer, which was placed at eye level about three 
feet in front of him. During the experiment the 
galvanic equipment and E were behind S’s line of 
vision. Illumination was afforded by an overhead 
light and a gooseneck lamp focused from behind 
S directly on the cards. 


Procedure. After the electrodes were ad- 
justed S was told to relax. The usual Ror- 
schach instructions [12, p. 32] were given, 
with the exception that S was told to give 
only one response to each card. This was done 
in the interest of uniformity. As a part of the 
effort to control voluntary movement on the 
part of S, the instructions were supplemented 
with a further statement to S to sit quietly 
and relax. It was believed that more detailed 
instructions regarding restriction of move- 
ment, coughing, et cetera, might cause S to 
be more tense than he might otherwise be. 
Such tension would introduce some error, 
since it is established that increased conduc- 
tance accompanies bodily tension [25]. 


After the instructions were read, the variable resistor
in the circuit was adjusted so as to pass a current
of 40 microamperes through S. An adaptation
period of approximately 90 seconds, or longer if 
necessary, was allowed for the microammeter needle 
to stabilize. So that the dropping of the first card 
would not startle S and thus confound the results 
[1], the statement was then made to S, “Now look 
at the card.” E then pressed the key dropping the 
cover card and revealing the first Rorschach plate. 
When S had given his verbal response and the 
meter needle had begun to return to the original 
level, the key was pressed again, dropping the 
Rorschach plate and revealing another plain white 
cover card. Time of presentation varied from plate 
to plate and from S to S. It was desired to record the
maximum galvanometric reading accompanying each 
plate. With some Ss the maximum was reached be- 
fore the verbal response was given, and with others 
after the verbal response was made. Latency of verb- 
al response varied, of course, as it does in the nor- 
mal Rorschach test situation. For each plate, resting 
level, maximum meter reading, and verbal re- 
sponse were recorded. 


After the Rorschach card was dropped, the 
needle was allowed to return to the original level. 
During this time the cover card concealed the next 
Rorschach plate. If after about two minutes the 
meter needle appeared to have stabilized above the 
previous resting level, indicating that more than 40 
microamperes were flowing through S’s branch of 
the circuit, the variable resistor was readjusted to 
impress the standard 40 microamperes through S,
and the new basal resistance level was recorded. 

It was found necessary to remove the plates and 
cover cards from the card changer drawer after 
half of them had dropped so that those in the drawer 
would not interfere with the dropping of the cards 
still on the rod. A period of stabilization was then 
allowed in order to rule out any disturbing effect 
this may have had on S. 

Since diurnal variations have often been observed 
in the GSR [15], the experiment was run only be- 
tween the hours 9:30 and 11:30 a.m. No record was 
made of temperature or humidity, for a number of 
investigators have discounted the effects of these 
variables on palmar conductance [21]. Observation 
of the microammeter needle during the experiment 
and results obtained by other investigators [9] lead 
to the belief that the GSR is not affected by the 
slight muscular movement involved in making a 
verbal response. 


RESULTS 


Change in conductance was calculated for 
each of the 10 Rorschach cards for each S. The 
data were subjected to analysis of variance. Re- 
sults of this analysis are presented in Table 1. 

TABLE 1

ANALYSIS OF VARIANCE

Source of                   Sum of
Variation            df     Squares    Variance      F        P*
Total               499     36,100
Cards                 9        429         47       1.81     >.05
Individuals          49     22,057        450      17.31     <.01
Position in Series    9      1,152        128       4.92     <.01
Remainder           432     11,461         26

*From Snedecor [23, p. 222-225].

Mean changes in conductance among the 50
Ss range from 0.8 microhms to 26.4 microhms,
with a mean of 12.1 and a median of 10.7. To
determine whether the sample was normally
distributed with respect to galvanic response to
the Rorschach plates, a χ² test was applied to
the individual means. The χ² value obtained
was 1.9, with a corresponding probability of
.75*, for four degrees of freedom [23].
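
The variance ratios of Table 1 can be rechecked from the printed sums of squares and degrees of freedom. The sketch below is only an arithmetic check, not the author's procedure: it assumes that each variance was truncated to a whole number before the F ratios were formed, which is an inference from the printed values and is not stated in the text. Under that assumption it reproduces the printed F ratios of 1.81, 17.31, and 4.92, and the 432 degrees of freedom for the remainder.

```python
# Recheck of Table 1: mean squares and F ratios from the printed
# sums of squares. Truncating each variance to a whole number is an
# assumption inferred from the printed values, not stated in the paper.
N_SUBJECTS = 50
N_CARDS = 10

# source of variation -> (sum of squares, degrees of freedom)
SOURCES = {
    "Cards": (429, N_CARDS - 1),
    "Individuals": (22057, N_SUBJECTS - 1),
    "Position in Series": (1152, N_CARDS - 1),
}
REMAINDER_SS = 11461


def anova_check():
    """Return remainder df, remainder variance, and the F ratio per source."""
    total_df = N_SUBJECTS * N_CARDS - 1
    remainder_df = total_df - sum(df for _, df in SOURCES.values())
    remainder_ms = REMAINDER_SS // remainder_df  # truncated (assumption)
    f_ratios = {}
    for name, (ss, df) in SOURCES.items():
        mean_square = ss // df  # truncated (assumption)
        f_ratios[name] = round(mean_square / remainder_ms, 2)
    return remainder_df, remainder_ms, f_ratios


if __name__ == "__main__":
    remainder_df, remainder_ms, f_ratios = anova_check()
    print("remainder df:", remainder_df, "remainder variance:", remainder_ms)
    for name, f in f_ratios.items():
        print(f"{name}: F = {f}")
```

With unrounded variances the ratios come out slightly smaller, so the whole-number convention matters only in the second decimal place and does not change which effects fall beyond the .01 and .05 points.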

Mean changes in conductance accompanying 
each card are shown graphically in Figure 1. 
Figure 2 is a graphic presentation of mean 
changes in conductance for each of the 10 posi- 
tions. 


DISCUSSION 


Reference to Table 1 shows that the F ra- 
tio for individuals, 17.31, is highly significant, 
P < .01. Wide differences have been demonstrated
in the “emotional” reaction of individuals
to the Rorschach cards as reflected in the
GSR. That this parameter is normally distributed
is inferred from the fact that the χ²
test did not reject the null hypothesis that individual
means are drawn at random from a
normal population.
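
The probability of .75 reported for a χ² of 1.9 with four degrees of freedom can be rechecked without tables. The sketch below uses the closed form of the upper-tail probability that holds for an even number of degrees of freedom; the function name is ours and is introduced only for illustration.

```python
import math


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """Upper-tail probability of chi-square for a positive, even df.

    For df = 2k the tail is exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form holds only for positive even df")
    half = x / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


if __name__ == "__main__":
    # Values reported in the Results: chi-square of 1.9 on 4 degrees of freedom
    print(round(chi2_upper_tail_even_df(1.9, 4), 2))
```

A tail probability this large means only that the observed distribution of individual means gives no ground for rejecting normality; it does not by itself establish it.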

In view of the low F ratio for cards, 1.81, 
P > .05, we must accept the null hypothesis 
that there is no significant difference among 
the Rorschach cards so far as their “affective 
value” (as indicated by change in palmar skin 
conductance) is concerned. While the results 
fail to offer statistically reliable evidence for 
differential “affective value” among the plates, 
it should be pointed out that the mean values 
obtained (see Figure 1) tend to support the 
view that Card VIII, in independence of posi- 
tion, presents the greatest “emotional” stimu- 
lus, at least insofar as this particular sample 
is concerned. The largest mean change in conductance
accompanied this card. In this connection,
mention must be made of results obtained
by Rockwell et al. [19]. They found in
the case of their 10 normal Ss who showed
“color shock” (which they defined as a paucity
of verbal responses) that there was a significant
decline in the GSR accompanying Card
VIII. This they had expected on the basis of
their postulation that “color shock” is a condition
of lowered excitability. This writer, on
the assumption that Card VIII possesses “affective
value,” predicted that the greatest increase
in the GSR would accompany this card.
The prediction was verified. The apparent
discrepancy between these results and Rockwell’s
may inhere in the fact that Rockwell’s
Ss were “specifically selected because they
showed an inhibition of associations to at least
one of the colored cards” [19, p. 140], whereas
our Ss were student volunteers neither selected
nor rejected on any basis. If a student
volunteered, his record was included, regardless
of what his Rorschach or galvanic responses
proved to be.


It will also be noted that the smallest amount 
of increase accompanied Card V. This, again, 
is in accord with Rorschach theory. In Ror- 
schach’s own words, Plate V is “The easiest 
form to interpret” [20, p. 52]. Klopfer and 
Kelley [12, p. 211] have found that this card 
is seldom rejected by S. Very little “affective” 
stimulation, then, would be expected from this 
card. 

Results do not support the expectancy that 
the heavily shaded cards would be accompanied 
by sharp increases in conductance. 


Although this study was not originally concerned 
with the effect of position of the cards in the series, 
the design of the experiment has necessitated a con- 
sideration of this factor. The high F ratio, 4.92, 
P < .01, indicates that this variable has an effect 
on change in conductance, at least so far as the
sample investigated is concerned. It was certainly
unexpected that more of the total variance would be 
due to position of the cards than to the cards them- 
selves. The decline in GSR from positions 1 to 3, 
shown in Figure 2, is of course explained and ex- 
pected as being due to adaptation to the experiment- 
al situation. The rise which begins with the fourth 
position would doubtless be explained by Rorschach 
theorists as an indication of an “accumulation of 
affect” in response to the cards as a whole. It is of 
particular interest that the eighth card to be shown, 
no matter which card it happened to be, was ac- 
companied by the highest mean change in conduct- 
ance. This fact could be used as a basis for counter- 
ing the Rorschach view that Card VIII per se has 
“affective value” with the claim that the eighth 
position in the series is a critical one. As already 
indicated, however, these data suggest that Card 
VIII does possess the greatest “affective value” re- 
gardless of its position in the series. Just why there 
should be a decline in GSR after the eighth card 
is not known. Ss had been told that they would be 
shown 10 cards. It has been suggested that this decline
represents relief that the experiment is nearly
over. But are Ss glad to be rid of the cards them- 
selves or of the total situation? However, since the 
GSR values accompanying the ninth and tenth posi- 
tions represent changes in conductance and not basal 
level, it is believed that this explanation is not ade- 
quate. 


It must be emphasized that in any interpreta- 
tion of these data caution must be exercised in 
generalizing from the sample to the popula- 
tion, since the 432 d. f. associated with the re- 
mainder mean square in Table 1 are derived 
from 10 observations repeated on the same 50 
individuals. 

No effort was made in this investigation to 
correlate the Rorschach scoring or the latency 
of verbal responses with change in conductance.
Such an analysis is planned for further studies.


SUMMARY 


Record was made of change in palmar skin 
conductance accompanying each of the Ror- 
schach plates upon presentation, by means of 
a Guhin card changer, to 50 male college 
students individually. On the assumption that 
increased conductance is indicative of “affec- 
tive tone,” a finding of significantly greater 
increases accompanying the colored and the 
heavily shaded plates would serve to streng- 
then the position of Rorschach theorists who 
maintain that these cards possess “affective 
value” for the subject. To counteract the ad- 
aptation effect of the GSR and at the same 
time to preserve as nearly as possible the nor- 
mal Rorschach sequence, the order of present- 
ing the cards was systematically rotated. Sig- 
nificant individual differences in galvanic re- 
sponse to the cards were obtained, implying 
differences in “affective” reaction to the test 
as a whole. The sample was found to be nor- 
mally distributed. Change in conductance 
among the cards was not found to be signifi- 
cantly different; therefore, there is no statisti- 
cal evidence that the cards differ among them- 
selves in “affective value.” The experimental 
design introduced consideration of effect on 
the GSR of card position. Position in the se- 
ries was found to have a significant effect on 
change in conductance for the sample inves- 
tigated. 


TWO SETS OF RORSCHACH RECORDS OBTAINED
BEFORE AND AFTER BRIEF PSYCHOTHERAPY 


EDITH LORD 


MENTAL HEALTH DIVISION 
ARIZONA STATE DEPARTMENT OF HEALTH 


AMONG the numerous unanswered or
partially answered questions in the field 
of psychology today, is one question of ever- 
increasing interest: Exactly what goes on in 
the subject (i.e., the client or patient) during 
a successful psychotherapeutic process or a 
series of counseling interviews? 

A variety of answers has been given to this 
question; however, the type of answer is usually
closely related to the particular hypotheses
underlying the psychological theory, or the 
system of therapy, to which the speaker sub- 
scribes. In other words, an inquirer might be 
told that John Doe’s observable change in per- 
sonal and social behavior, following psycho- 
therapy, is the result of (a) the resolution of 
his oedipus complex, if the speaker were a 
Freudian, (b) relearning or reconditioning 
processes, if a Behaviorist or Pavlovian, (c) 
penetration of his character-armor, if a fol- 
lower of Reich, (d) achievement of self-con- 
sistency, if Lecky’s theses were dominant, and 
(e) self-acceptance, if the speaker adhered to 
Rogers’ tenets, etc. The explanation of
what has gone on in psychotherapy is more 
often a revelation of the therapist’s philosophy 
or bias than it is a description or explanation 
of what actually occurred within the thera- 
pized subject. 

If an imprint of a personality could be made 
before therapy, and another imprint made of 
the same individual’s personality after therapy, 
a comparison of the two samples would give 
some basis for observing and describing chang- 
es, if any, in the personality structure. One 
might at least say that the changes in the 
records are concomitants of the therapeutic 
process. From the two sets of data, it should 
be possible to get some cues as to what has 
happened within the individual during the psychotherapeutic
process.

The Rorschach ink blot test is widely ac- 
cepted as one of the most acute measures of 
personality structure available. The Rorschach 
consists of ten ink blots which provide optical 
stimuli having color and ambiguous, or mean- 
ing-free form. The subject’s perceptual associa- 
tions and structured concepts related to these 
blots may be general and impersonal or may 
symbolize individual trends, forces, or drives 
at work deep within the personality. 

In any event the total protocol seems to give 
a keyhole peep of the perceptive and appercep- 
tive processes or components of the personality, 
the ways in which an individual perceives, 
structures, interprets his world. If this instru- 
ment does, in fact, measure personality struc- 
ture, then a person’s Rorschach record should 
change somewhat after successful psychother- 
apy, if such psychotherapy actually brings 
about an alteration in personality. For ex- 
ample, with observed changes in personal and 
social behavior, one might expect changes in 
the absolute numbers, the percentages, and the 
ratios of those Rorschach scoring symbols 
which are supposed to measure the way in 
which a subject utilizes his innate capacities 
and his manner of interaction with the exter- 
nal environment. 


METHODOLOGY 


Rorschach ink blot test records were ob- 
tained on two subjects before and after brief 
psychotherapy. Both subjects, a year after the 
initial contact, displayed observable behavioral 
changes in the direction of more adequate per- 
sonal and social adjustment, the criterion of 
successful psychotherapy. 

It is unnecessary in this study to relate in 
detail the psychological problems of the subjects
and to recount the voluminous subject
matter of the interviews. It is enough for the 
present purpose to summarize the subjects’ 
personality problems at the time of administra- 
tion of the first Rorschach test, before the psy- 
chotherapeutic interviews were initiated, and 
the behavior, with reference to the presenting 
problems, a year later at the time of re-examin- 
ation. 

Both subjects, one male and one female, 
were 25 years old at the time of the first con- 
tact and first administration of the Rorschach 
test. Both presented personality disorders sufficiently
severe to interfere seriously with interpersonal
relationships and to preclude vocational
adjustment.

The two subjects had a series of therapeutic 
interviews over a period of six months, follow- 
ed by a similar period of time during which no 
regularly scheduled interviews were held. The 
second Rorschach test was then administered, 
one year after the first. 


Subject A sought help on a single problem: 
He reported that since childhood he had con- 
sistently experienced complete speech blocking 
when with groups of two or more persons and 
occasionally when attempting to talk to even 
one person. His problem was not merely decreased
fluency; it was an actual inability to
utter a word. The ceiling in his professional 
progress had been reached. He had few friends. 
With little optimism, he wanted to try psy- 
chotherapy as a last resort before resigning 
himself to a lonely life as a nonsocial, medi- 
ocre clerk. 

One year later he was talking regularly be- 
fore groups, usually with some anxiety, occa- 
sionally nonfluent, but with a six-months’ re- 
cord of no complete speech blocking. There is 
no intent to present a black-to-white picture of 
this personality. Numerous problems remained 
(and probably still remain) unsolved. Never- 
theless, his observed and reported behavior be- 
fore and after brief psychotherapy was dif- 
ferent. 

Subject B’s chief presenting problems were 
loneliness and job instability. She wanted to 
have close friends of both sexes, she reported, 
but had none of either sex. She felt guilty and 
embarrassed when with females. With males, 
she would experience visible trembling, plus 
anxiety-laden, sex-related thoughts. She was
practically resigned to a friendless spinsterhood.
Her work history was one of constant change 
as she left one position after another because 
of intolerable interpersonal tensions. One year
later she was an elected officer of a women’s 
club, and had several close female friends. Too, 
she had several male friends, one of whom she 
was contemplating marrying. Again, the tran- 
sition was by no means one from sickness to 
rosy mental health. Numerous unsolved conflicts
continued to disturb her interpersonal
relationships; however, the direction of development
appeared altered: her observed adjustment
was markedly improved, and her original
despair was replaced with an optimism that 
eventually even better adjustment could be 
achieved. 


RESULTS 


Let us see how these behavioral changes are 
reflected in the before-and-after Rorschach 
protocols of the two subjects. The test records
were analyzed in several ways: (1) by com- 
paring the paired psychograms; (2) by making 
qualitative comparisons based on usual inter- 
pretations of the Rorschachs; (3) by comput- 
ing the Buhler-Lefever Basic Rorschach Scores 
and determining the Integration Levels of the 
subjects on each of the tests; (4) by examin- 
ing the negatively and positively weighted 
components of the records which were present 
or absent in the two sets of data. 

When the four psychograms were mixed, 
persons of little or no experience with psycho- 
logical tools were able to separate and pair the 
graphs belonging to Case A and to Case B. 
There is a basic similarity in the before-and-af- 
ter psychograms of the two individuals; how- 
ever, considering the separate graphs of either 
pair, numerous marked differences are immedi- 
ately apparent. Changes in absolute numbers 
and in ratios and percentages of the scoring 
symbols resulted, on the re-examination, in a 
considerably modified pattern, but did not pro- 
duce an unrecognizable distortion of the orig- 
inal pattern. 

One may infer from these data that each 
personality maintained a separate consistency 
which was identifiably reflected in both the be- 
fore and after graphs of the personality test. 
At the same time, each personality altered 
noticeably, and these alterations, too, were reflected
in the graphs.

TABLE 1
SUBJECT A: ALTERATIONS OF SCORING SYMBOLS

Symbol              Before    After        Increase     Decrease
M                      1         6            5
FM                     3         6            3
m                      1         0                          1
k                      1         0                          1
K                      5         1                          4
FK                     2         0                          2
Fc                     3         4            1
c (cF)                 4     [illegible]      1 [column unclear]
C'                     3         4            1
FC                     2         3            1
CF                    11         4                          7
C                      2         1                          1
R                     56        50                          6
T/R                   26"       30"           4"
Rt Ch                  7"    [illegible]   [illegible]
Rtc                    9"       17"           8"
F%                    36        40            4%
Shading %             45        44                          1%
A%                    21        38           17
P                      2       5 (+2)       3 (+2)
To:De                 8:6      17:9          9:3
Sum C                 13         7                          6
M : Sum C            1:13      6:7           5:0           0:6
FM + m : c + C'      4:10      6:9           2:0           0:1
CR%                   34        36            2
W:M                  4:1       5:6           1:5
W%                     7        10            3
D%                    27        34            7
d%                     9        20           11
Dd + S%               57        36                         21
S                      4         0                          4

Case A: Among the most important modifications
in the records of Case A (Table 1),
according to usual interpretations of the scoring
symbols [1, 2], is the increase in concepts
consistently interpreted as a sign of inner
adjustment or equilibrium, a capacity for the 
absorption of emotional stimuli whether origin- 
ating from within or without the personality. 
There is a decrease in the symbols which 
measure uncontrolled, impulsive emotional re- 
actions. A change in the ratio of whole-to-de- 
tail concepts indicates a diminution of a pica- 
yunish, hypercritical attitude. The increase in 
popular responses shows a development in 
thinking more in line with community thought. 
The marked reduction in the use of white 
spaces — that is, relating to background rather 
than to ground — suggests a reduction in re- 
sistance behavior, stemming from feelings of 
inadequacy. Perhaps the most important single 
modification is in the subject’s attention to 
larger areas of the blot stimulus rather than 
the original, almost exclusive, absorption with 
minute details of the stimuli. On a behavioral
level, this change would be reflected in the
subject’s paying less attention to the nonessen- 
tial aspects of his environment and relating 
more readily to larger, more important aspects 
of the total external environment. 

Although the changes noted in the foregoing 
paragraph are in the direction of “normalcy,” 
neither the “before” nor the “after” protocol 
looks entirely healthy. One may ask: How sick 
was this subject? How much improvement is 
reflected in the differences between records? 
And how sick does he remain at the time of re- 
examination ? 

Subjectively, one may give qualitative answers
to these questions, based on the foregoing
interpretations. Applying a new technique
to the data, the Buhler-Lefever “Basic Ror- 
schach Score” [1], this difference can be ex- 
pressed quantitatively by the assignment of 
statistically derived plus and minus weights to 
the various obtained subscores on the test. We 
find that a marked difference exists in total 
weighted scores of the before-and-after Ror- 
schachs. 

On the initial test, Subject A’s positive 
weights totaled 18, the negative weights to- 
taled 25, giving an algebraic sum of minus 
seven. This score would classify the sub- 
ject in Level III of the four Integration Lev- 
els introduced in connection with the Basic 
Rorschach Score. Level III is described as the 
Level of Impairment or Defect. On the reex- 
amination, Subject A’s record merited positive 
weights totaling 29 and negative weights of 14, 
resulting in a Basic Rorschach Score of plus 
fifteen, the upper limit of Integration Level II, 
the Level of Conflict. These different scores 
and levels, determined by scoring weights 
which were statistically established, give us a 
quantitative measure of Subject A’s person- 
ality before and after brief psychotherapy. The 
descriptive terms used to label the two different
levels the subject reached on the Rorschachs
(i.e., from Defect to Conflict) are
terms which seem quite well to fit the subjec- 
tive evaluations of his earlier and later be- 
havior. 
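
The arithmetic behind the two Basic Rorschach Scores can be laid out in a few lines. The sketch below only forms the algebraic sum of the weight totals quoted above for Subject A; the function name is ours, and the cutoffs of the four Integration Levels are deliberately not reproduced, since the text gives only that plus fifteen is the upper limit of Level II.

```python
def basic_rorschach_score(positive_total: int, negative_total: int) -> int:
    """Algebraic sum of the positive and negative weight totals."""
    return positive_total - negative_total


# (positive weights, negative weights) for Subject A, as quoted in the text
SUBJECT_A = {"before": (18, 25), "after": (29, 14)}

scores = {
    occasion: basic_rorschach_score(pos, neg)
    for occasion, (pos, neg) in SUBJECT_A.items()
}
gain = scores["after"] - scores["before"]

if __name__ == "__main__":
    print(scores, "gain:", gain)
```

Of the 22-point gain, 11 points come from added positive weights (18 to 29) and 11 from negative weights no longer earned (25 to 14).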

The data thus far presented indicate that 
the Rorschach record does, in fact, alter as per- 
sonality dynamics alter. Using standard inter- 
pretative procedures, a qualitative description 
of these alterations has been made. Using 
statistically determined weights, the changes
are quantitatively expressed; and improvement
in integration is reflected in a shift from
Level III to Level II. The next step in the 
present inquiry is to discover precisely what 
differences in the two Rorschach protocols of 
Subject A contributed to the increases and 
decreases of the algebraic weights. We know 
to what extent the tests differed; we want to 
know how they differed. 


TABLE 2
SUBJECT A: NEGATIVELY WEIGHTED COMPONENTS
PRESENT BEFORE, ABSENT AFTER,
PSYCHOTHERAPY

Item*   Component               Weight
  5     M = 0 to 1                -3
 12     FM twice+ M               -2
 18     2+ (k + K)                -1
 53     W : M = 3+ : 1            -1
 64     S = [illegible]       [illegible]
 71     P = 4-                    -2
 75     3 Repetitions             -1
 77     1 Confabulation           -3
 81     Explosions                -1

*Number of item on Buhler-Lefever “Diagnostic Rorschach
Sign List” [1, p. 11].


Interpretation of the weighted symbols reveals
that the following factors dropped out
of the personality picture (Table 2): Strong
primitive drives that demanded immediate
satisfaction and interfered with goal-deferment
and with goal-directed personality organization;
excessive anxiety, chiefly of the unfixed
or “free-floating” kind; inability to perceive
the environment or to think in line with community
thought; a tendency toward stereotyped
thinking; strong feelings of resistance toward
the external environment; lack of internal
equilibrium with particularly significant results
in relation to the subject’s incapacity to
achieve on a level with his aspiration; explosive
reactions to emotional stimuli. All of these
facets dropped out of the personality picture.

The following new personality components
existed after brief psychotherapy (Table 3):
Internal equilibrium; only as much anxiety as
is found in the records of adjusted persons; a
tolerant, noncritical attitude toward other
human beings; thinking in line with community
thinking; a favorable ratio between emotional
reactions and integrated, goal-directed behavior.


TABLE 3
SUBJECT A: POSITIVE WEIGHTS ABSENT BEFORE,
PRESENT AFTER, PSYCHOTHERAPY

Item    Component                        Weight
  8     M = 4+                             +3
 17     [illegible] to 1+ (k + K)          +1
 69     To : De = 2 : 1                    +1
 72     P = 5+                             +1
 94     M = 3+ plus Sum C = 3+             +3
 97     (14 + 17 + 20)                     +2


Case B: Turning now to the case of Subject
B, it will be sufficient to present a summary
of the data gathered and proceed at once
to an analysis of the measured changes in her
two test records (Table 4).


TABLE 4
SUBJECT B: ALTERATIONS OF SCORING SYMBOLS

Symbol                       Before     After        Change
M                               9         11         up 2
FM                              2          3         up 1
m                               4          0         down 4
F                              19         18         down 1
Fc                              6          4         down 2
C'                              0          2         up 2
FC                              3          4         up 1
CF                              0          1         up 1
T/R                            31"        19"        down 12"
Rt ch.                          7"         4"        down 3"
Rtc                             9"         6"        [illegible]
FM%                             4%         7%        up 3%
F%                             42%        40%        down 2%
(FK + F + Fc)/R                60%        49%        down 11%
A%                             29%        27%        down 2%
P                               4        5 (+1)      up 1 (+1)
(H + A):(Hd + Ad)             15:16      18:18       up 3:2
Sum C                          1.5        3.0        up 1.5
M : Sum C                     9:1.5      11:3        up 2:1.5
(FM + m):(Fc + c + C')         6:8        3:8        down 3:0
W:M                            5:9    [illegible]:11
d%                             22%        20%        down 2%
Dd + S%                        27%        29%        up 2%
S                             0 (+2)     1 (+2)      up 1
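
The Sum C entries of Table 4 can be rechecked from the color rows of the same table. The sketch below assumes the conventional weighting of the color sum (FC counted as one half, CF as one, pure C as one and a half); that formula is not stated in the text. It also assumes that Subject B gave no pure C responses on either test, since Table 4 lists no C row.

```python
def sum_c(fc: int, cf: int, c: int = 0) -> float:
    """Weighted color sum: FC counts 0.5, CF counts 1.0, C counts 1.5.

    The weighting is the conventional one and is an assumption here;
    the text does not state it.
    """
    return (fc + 2 * cf + 3 * c) / 2


# Subject B, Table 4; pure C taken as 0 because no C row is listed (assumption)
before = sum_c(fc=3, cf=0)
after = sum_c(fc=4, cf=1)

if __name__ == "__main__":
    print(before, after, after - before)
```

Both values agree with the printed Sum C of 1.5 before and 3.0 after, so the doubling of the color sum rests on a single added FC and a single added CF response.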


Subject B achieved a Basic Rorschach Score 
of minus 5 before therapy, plus 22 after thera- 
py. She changed from Integration Level III 
to Level I, from the Level of Impairment 
or Defect to the Level of Adequacy. 

Following are the negative components
which dropped out of the personality picture
(Table 5): A tendency to dwell too long on
one concept or thought; excessive tensions;
severe repression of emotionality; failure to react
to all the stimuli in the environment;
thinking out of line with community thinking;
preoccupation with bodily processes; and excessive
tendencies toward morbidity.


TABLE 5
SUBJECT B: NEGATIVE WEIGHTS PRESENT BEFORE,
ABSENT AFTER, PSYCHOTHERAPY

Item    Component                   Weight
  4     T = 30"+                      -2
 15     3+ m                          -1
 33     C' absent                     -2
 40     Sum C = 2½-                   -3
 71     P = 4-                        -2
 83     3+ At                         -1
 93     3+ dead, injury, etc.         -2

Components of the personality picture which 
were absent before but present after psychotherapy
are the following (Table 6): Capacity
for tension tolerance; increased intellectual 
control; sensitivity functioning as an aid in 
tactful social relationships rather than as an 
incentive to self-pity; reaction to all possible 
stimuli from the environment; thinking in line 
with community thinking; emotional respon- 
siveness; and a favorable ratio between emo- 
tional reactions and integrated, goal-directed 
behavior. 


TABLE 6
SUBJECT B: POSITIVE WEIGHTS ABSENT BEFORE,
PRESENT AFTER, PSYCHOTHERAPY

Item    Component                   Weight
 14     1 to 2+ m                     +2
 23     21 to 40% F%                  +3
 29     Fc = 1 to 5+                  +2
 34     C' present                    +2
 41     Sum C = 3+                    +3
 72     P = 5+                        +1
 80     Blood, fire, smoke            +1
 94     M = 3+ and Sum C = 3+         +3


DISCUSSION


Both sets of test data reflected some basic or 
unchanged negative and positive personality 
factors. Both sets also contained evidence that, 
concomitant with the therapeutic process, some 
negatively weighted personality factors drop- 
ped out of the records and some positively 
weighted factors emerged. One may infer, 
therefore, that the Rorschach test actually does 
reflect both constant and changed personality 
components. 

The records of both subjects had only two
altered features in common: (1) a shift from
thinking that is out of line with community
thinking to thinking that is in line with community
thought, and (2) the development of
a favorable ratio between emotional reactions
and integrated, goal-directed behavior; or, differently
worded, the achievement of a healthier
balance between intellectual and emotional
components of the personalities. The numerous
other additions to and subtractions from the
personality pictures were apparently disparate
functions of the unique configuration of each
personality.


CONCLUSIONS 


While generalizations cannot be made from 
data obtained from so few records, one may 
present the following hypotheses, to be tested 
by future research: 

1. Basic personality configurations remain 
recognizably constant despite successful brief 
psychotherapy. 

2. Measurable personality changes do occur 
within subjects as a concomitant of brief psy- 
chotherapy. 

3. Both positive and negative components 
of the personality persist during and after 
brief psychotherapy. 

4. Both the constant and the variable fac- 
tors are reflected in Rorschachs administered 
before and after therapy. 

5. Personality changes occurring with successful
therapy may consistently include:

(a) a more adequate inner balance be- 
tween intellectual and emotional personality 
components. 

(b) the emergence of thinking processes
that are in line with community thought. 

6. Other changes within the subject prob- 
ably vary with, and are determined by, the 
unique personality structure of each individual. 

In addition to the hypotheses growing out 
of the present study, there is the hope that the 
method herein employed will serve as a stimu- 
lus for additional research of this sort in a 
continued effort to discover, more and more 
precisely, what actually or consistently occurs 
within subjects during psychotherapy. Too, 
with minor modifications, this method could 
well be employed in much-needed validity 
studies of this sensitive psychological tool,
the Rorschach test.


Received June 23, 1949. 


NONDIRECTIVE PLAY THERAPY WITH
RETARDED READERS¹


ROBERT E. BILLS 


UNIVERSITY OF KENTUCKY 


THE problem of the retarded reader has
caused educators much concern and has
stimulated much research. Retardation in 
reading is now thought to be the result of poor 
teaching or of the inability of a child to learn 
by customary instructional procedures; conse- 
quently the retarded reader is usually given 
enriched reading instruction of an individual 
and remedial nature which is designed to teach 
him what he has failed to learn in the usual 
way. 

Often remedial reading instruction proves 
to be valuable and the child does learn to read 
with skill commensurate with his ability. 
There are probably many causes of the gains 
in ability resulting from such individual in- 
struction. Many retarded readers, however, 
fail to improve in spite of such instruction. 
Clinical experience has shown that some mem- 
bers of this group may learn to read when the 
remedial practice is preceded or accompanied 
by individual psychotherapy. 

Lecky [6] and Axline [2] have postulated 
that poor reading may result from inconsis- 
tencies in the attitudinal system of a child or 
from difficulty in resolving a conflict between 
a concept of self as a poor reader and a con- 
cept of self as a good reader. 

If the difficulty which some retarded read- 
ers show is due mainly to inconsistencies with- 
in their value systems, and if nondirective play 
therapy can aid the individual in changing his 
attitude toward self or in re-evaluating his con- 
cept, then corresponding changes should occur 
in subject matter ability after nondirective
play therapy. The following experiment was
designed to test the hypothesis that significant
improvements occur in the reading ability of
a retarded reader when he has been given a
nondirective play therapy experience.

¹A condensation of portions of a doctoral project
submitted to the Advanced School of Education,
Teachers College, Columbia University in 1948. The
author is deeply grateful to Professors Nicholas
Hobbs, Ruth Strang, and Helen Walker for their
encouragement and constructive criticisms.


THE DESIGN OF THE INVESTIGATION 


Selection of the class. Through the coopera- 
tion of the Bureau of Reference, Research, and 
Statistics of the New York City Board of Ed- 
ucation and a principal of an elementary 
school in New York City, permission was ob- 
tained to work with a class of third-graders 
who had been previously classified as slow 
learners. It must be emphasized that the chil- 
dren included in this study were in this third- 
grade class because of an inability to learn at 
a normal rate and not because of intellectual 
or emotional factors. Intellectual or emotion- 
al factors may have caused these children to 
learn slowly, but they were not the criteria for 
placement in this class. 


It is to be expected, because of their presence in 
this class, that the children in the study would be 
low in achievement and would exhibit some retar- 
dation in school, but it does not necessarily follow 
that these children would exhibit difficulty in emo- 
tional adjustment to a greater degree than a group 
of children not classified as slow learners. 

At the beginning of the study twenty-two chil- 
dren were in the class. Of these, two were of foreign 
birth and had English language handicaps of such 
severity as to render their scores on standardized 
tests meaningless, and two others left the school 
district before the study was completed. Final test 
data are included for the remaining eighteen chil- 
dren. 


The three periods of the study. The plan of 
the investigation included three periods of thir- 
ty school days each. The first of these periods 
was the control period, the second the therapy 
period, and the third was considered as a period
for noting lasting or cumulative effects of
the therapy. It was thought that the most ade- 
quate comparison which could be obtained 
would be made between the gain of each child 
during the therapy and third periods and his 
gain in the control period. Each of the therapy 
children could have been matched with other 
children on the basis of certain objective fac- 
tors such as sex, age, grade, and intelligence, 
but it is far from certain that these are the im- 
portant factors in the problem. 


Employing the children as their own control 
could possibly introduce certain invalidating features 
such as the instruction given in the different periods, 
the health of the child, home circumstances, the 
child’s motivation, and school attendance. It is 
reasonable to expect that the influences operating on 
the therapy group during the control and therapy 
periods would be as constant as the influences oper- 
ating on the therapy group and a selected control 
group during any one period. It was assumed, there- 
fore, that the children could be matched with them- 
selves during two periods of the experiment with a 
greater degree of control of important personality 
variables than if they were matched with another 
group of children on the basis of objective criteria. 


The tests employed. Three types of test 
were used in the selection of the therapy group 
and in measuring the gains of this group and 
the rest of the class during the three periods of 
the study. They were: (1) the Gates test of 
paragraph meaning, (2) The Gray Oral Read- 
ing Paragraphs, and (3) The Revised Stan- 
ford-Binet, Form L. 


On the first day of the control period all of the 
children were tested with the oral and silent reading 
tests. During the next two weeks they were tested 
with the intelligence test. At the end of the control 
period all of the children were retested with the 
oral and silent reading tests, and these tests were 
given again at the end of the therapy and third 
periods. 


Selection of the therapy group. The Gates 
Primary Reading Tests of paragraph meaning 
at the beginning of the control period gave a 
reading grade score and a reading age equiva- 
lent for each child and the intelligence test 
gave a mental age. Any discrepancy between 
mental age and reading age was taken as a 
measure of reading retardation. The four 
children who showed the greatest discrepancy 
were chosen as part of the therapy group. Since 
these children had very high intelligence quo- 


tients the other four children were selected 
from the group with discrepancies and approx- 
imately average intelligence quotients in order 
to determine not only the effect of the experi- 
ence on very intelligent children but also the 
effect on four children of average intelligence. 
These eight children made up the therapy 
group. The remaining children in the class 
may be thought of as a comparison group but 
the inadequacies of this group for comparison 
purposes must be kept in mind. 


Treatment of the therapy group. After the 
testing with the Revised Stanford Binet noth- 
ing was done with any of the children during 
the control period. As has already been stated, 
all of the children were retested with the read- 
ing tests at the end of this period. 


During the second or therapy period, the eight 
children of the therapy group were given a play 
experience of a nondirective therapeutic nature fol- 
lowing the principles established by Rogers [8], 
Axline [2], and others. For the first three weeks of 
this period the children met in individual sessions 
with the experimenter and during the last three 
weeks each child attended an individual session 
and one group meeting each week. In all, five of
the children had six individual sessions and three
group meetings in a period of six weeks; two of
the children had six individual meetings and two
group contacts; and the remaining child had four
individual meetings and one group contact. Each
session lasted forty-five minutes. All of the children 
in the class were retested at the end of this period 
with the oral and silent reading tests. 


Nothing was done with the children during the 
third period. At the end of this time all of the 
children were tested for the fourth time with the 
oral and silent reading tests. 


The collection of data. The collected data 
were of six types including: (1) the results of 
the standardized tests, (2) the recordings of 
the play sessions, (3) the school records, (4) 
the ratings of the judges, (5) the observations 
of the Binet examiner, and (6) observations 
of the reading instruction. The first of these 
has already been discussed. 


Each of the play sessions was recorded by means 
of a wire recorder and notes made by the worker, 
and these were used to supplement each other in the 
formulation of a verbatim transcription of what oc- 
curred during the individual play sessions. It was 
from these recordings that the judges made their 
ratings. 


The school records supplied background information
on the children, attendance records, and records
of previous tests. 


The three judges employed here were well quali- 
fied.2 They had satisfactorily completed supervised 
counseling experience and were graduate students 
in the final stages of completion of their doctoral 
work. In addition each judge had several years of 
experience in dealing with psychological problems 
both in and out of clinical situations. Each judge 
adhered to a different therapeutic school, including 
eclectic, psychodynamic, and nondirective views. 


The observations of the Binet examiner consisted 
of the subjective impressions which each child gave 
during the testing situation. This description in- 
cluded mainly those points which aided in under- 
standing the performance of the child. The Binet 
examiner was well qualified and had considerable 
experience in testing children of the age range in- 
cluded in this class. The experimenter conducted no 
tests because of the possibility of biasing the data. 

It was believed that any gains which the 
children made during the different periods 
might possibly be accounted for by differences 
in the methods of the teacher, a change in em- 
phasis given to individual children, a remedial 
approach by the teacher, or other factors. To 
determine if such factors might be at work, the 
experimenter made several observations during 
each period to determine the constancy of the 
instruction and the emphasis given each child. 
In the third period of the investigation the 
writer and two well qualified teachers made 
three independent observations of the reading 
class. The three observers had considerable 
experience in teaching, amounting in two cases 
to seven years and in the third to four years, 
and two of the observers had experience in ad- 
ministration and supervision. These observers 
concluded that the instruction was equal for 
all members of the class, that it was not reme- 
dial in nature, and that it was constant for each 
period of the experiment. 


THE GROUP DATA 


The data of this study were obtained in two 
ways: by an intensive study of eight individual
children, and by the study of the children
as a group. Because of limitations of space the 
results of the study of the eight children as 
individuals are not presented in this paper.3


2The author is indebted to Miss Ruth Witty and 
Messrs. Leon Gorlow and Ija Korner for their serv- 
ices as judges. 


The group data are presented below. 


The comparison group. Since the intelligence 
and reading tests were given to the entire class, 
test data are available for ten children who 
were not included in the play therapy exper- 
ience and who make up the comparison or non- 
therapy group. It must be emphasized that the 
therapy and nontherapy groups were not 
matched and that statistical comparison of the 
two groups is impossible. 

The two most obvious differences between 
the therapy and nontherapy groups were the 
factors of intelligence and reading retardation. 
The average intelligence quotient of the thera- 
py group was 130 and of the nontherapy group 
was 95. All of the children in the therapy 
group showed a negative discrepancy between 
mental age and reading age but only three of 
the children in the nontherapy group showed 


TABLE 1
CHRONOLOGICAL AGE, MENTAL AGE, AND INTELLIGENCE QUOTIENT OF EACH CHILD

                    Chronological    Mental
Name                Age*             Age        IQ

THERAPY GROUP
 1. Nancy            8-1              8-0        99
 2. Jean             8-0              8-2       102
 3. Bernice          7-11             9-9       123
 4. George           8-7              9-6       111
 5. David            8-7             12-8       148
 6. Janice           7-4             10-10      148
 7. Mary             7-11            11-8       147
 8. Jack             9-3             14-9       159

NONTHERAPY GROUP
 9. Dorris           8-11             7-4        82
10. Harry            9-6              6-8        70
11. Emily            8-3              8-0        97
12. Lester           8-4              7-8        92
13. Bob              7-10             9-9       124
14. Grace            9-1              7-10       86
15. Sibil            8-7              7-4        85
16. Roberta          8-0              8-11      111
17. Christine        7-10             9-11      127
18. Wally            8-8             10-8       123

*The chronological age is the age as of the date of
testing with the Revised Stanford-Binet, Form L. Ages
are given in years and months.


3This information is given in detail in R. E. Bills,
An investigation of the effects of individual and 
group play therapy on the reading level of retarded 
readers. Unpublished Doctor’s Project, Teachers 
Col., Columbia Univ., New York, 1948. 
this discrepancy.

The data for the nontherapy group are use- 
ful in showing the gains of this group during 
the three periods of the study and the relative 
equivalence of the tests employed. They have 
little value as a means of comparison with the 
gains of the therapy group. 


Test results. The chronological age, mental 
age, and intelligence quotient of each child in 
the class are given in Table 1. The range of 
chronological ages for the entire class was from 
7-4 years to 9-6 years, the range of mental ages 
was from 6-8 years to 14-9 years, and the in- 
telligence quotients ranged from 70 to 159. 
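
The quotients in Table 1 appear consistent with the conventional ratio IQ, that is, mental age divided by chronological age and multiplied by 100. The paper does not state the formula, so the following is only a sketch under that assumption; the function names are hypothetical and the three rows are copied from Table 1.

```python
def to_months(age: str) -> int:
    """Convert an age written as 'years-months' (for example '8-7') to months."""
    years, months = age.split("-")
    return int(years) * 12 + int(months)


def ratio_iq(chronological_age: str, mental_age: str) -> int:
    """Assumed ratio IQ: 100 * mental age / chronological age, rounded."""
    return round(100 * to_months(mental_age) / to_months(chronological_age))


# (chronological age, mental age, printed IQ) for three children in Table 1
table_1_sample = {
    "Nancy": ("8-1", "8-0", 99),
    "Jack": ("9-3", "14-9", 159),
    "Harry": ("9-6", "6-8", 70),
}

for name, (ca, ma, printed_iq) in table_1_sample.items():
    # Print the recomputed quotient next to the printed one for comparison.
    print(name, ratio_iq(ca, ma), printed_iq)
```

Under this assumption the recomputed quotient agrees with the printed one for each of the three children shown.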

The mental age and reading age for each
child, and the results of subtracting the mental
age from the reading age, are given in Table
2. Using the assumed criterion of reading re- 
tardation eleven of the eighteen children in the 


TABLE 2
DISCREPANCY BETWEEN READING AGE AND MENTAL AGE FOR EACH CHILD*

                    Reading     Mental     Discrep-
Name                Age         Age        ancy†

THERAPY GROUP
 1. Nancy           7-3          8-0       −0-9
 2. Jean            7-7          8-2       −0-7
 3. Bernice         7-5          9-9       −2-4
 4. George          8-2.5        9-6       −1-3.5
 5. David           9-0.5       12-8       −3-7.5
 6. Janice          8-7         10-10      −2-3
 7. Mary            8-7         11-8       −3-1
 8. Jack            8-9.5       14-9       −5-11.5

NONTHERAPY GROUP
 9. Dorris          7-4          7-4        0-0
10. Harry           7-3          6-8       +0-7
11. Emily           8-5          8-0       +0-5
12. Lester          8-8.2        7-8       +1-0.2
13. Bob             9-0.5        9-9       −0-8.5
14. Grace           8-7          7-10      +0-9
15. Sibil           8-9.5        7-4       +1-5.5
16. Roberta         9-3.5‡       8-11      +0-4.5
17. Christine       9-0.5        9-11      −0-10.5
18. Wally           8-7         10-8       −2-1

*Reading age was determined from the first test of
paragraph meaning of the Gates Primary Reading
Tests.

†This discrepancy is obtained by subtracting mental
age from reading age. The figure is given in years
and months or fractions of a month.

‡This is the highest score obtainable on this form of
the test.


class were classified as retarded readers, one 
child showed a reading age equivalent to his 
mental age, and the remaining six children had 
reading ages greater than their mental ages. 
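
The classification just described can be retraced from the printed ages in Table 2. The sketch below is illustrative only: the helper name is hypothetical, ages are converted to months (allowing the fractional months that appear in the table), and the discrepancy is taken as reading age minus mental age, following the footnote to Table 2.

```python
def to_months(age: str) -> float:
    """Convert 'years-months' (months may be fractional, e.g. '8-2.5') to months."""
    years, months = age.split("-")
    return int(years) * 12 + float(months)


# (reading age, mental age) for each child as printed in Table 2
table_2 = {
    "Nancy": ("7-3", "8-0"), "Jean": ("7-7", "8-2"),
    "Bernice": ("7-5", "9-9"), "George": ("8-2.5", "9-6"),
    "David": ("9-0.5", "12-8"), "Janice": ("8-7", "10-10"),
    "Mary": ("8-7", "11-8"), "Jack": ("8-9.5", "14-9"),
    "Dorris": ("7-4", "7-4"), "Harry": ("7-3", "6-8"),
    "Emily": ("8-5", "8-0"), "Lester": ("8-8.2", "7-8"),
    "Bob": ("9-0.5", "9-9"), "Grace": ("8-7", "7-10"),
    "Sibil": ("8-9.5", "7-4"), "Roberta": ("9-3.5", "8-11"),
    "Christine": ("9-0.5", "9-11"), "Wally": ("8-7", "10-8"),
}

# Discrepancy in months: reading age minus mental age.
discrepancy = {
    name: to_months(reading) - to_months(mental)
    for name, (reading, mental) in table_2.items()
}

retarded = [name for name, d in discrepancy.items() if d < 0]
at_level = [name for name, d in discrepancy.items() if d == 0]
ahead = [name for name, d in discrepancy.items() if d > 0]

print(len(retarded), len(at_level), len(ahead))  # 11 1 6
```

The three counts correspond to the eleven retarded readers, the one child reading at his mental age, and the six children reading above it.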


The grade scores of each child on the four 
tests of silent reading, and the change in score 
for each child in the three periods of the study 
are given in Table 3. The four tests included 
in this table showed that during the control 
period five of the eight children in the therapy 
group showed an increase in grade score, 
while only one of the ten children in the non- 
therapy group showed an increase. In the 
therapy period, all eight of the children in the 
therapy group and nine of the ten children in 
the nontherapy group showed increases in 
grade score. In the third period of the study 
six of the children in the therapy group and 
five of the children in the nontherapy group 
showed increases in grade score. 
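
The counts in the preceding paragraph can be tallied directly from the gain columns of Table 3. The following is a minimal sketch; the dictionary layout and names are assumptions made for illustration, and the values are the gains as printed, listed in the order of the children in the table.

```python
# Printed gains in grade score from Table 3, by group and period.
gains = {
    "therapy_group": {
        "control": [0.00, 0.10, 0.10, -0.45, -0.20, 0.25, 0.25, 0.20],
        "therapy": [0.80, 1.35, 0.90, 0.45, 0.65, 2.05, 1.45, 0.45],
        "third": [0.05, 0.25, 0.40, -0.15, 2.00, 0.00, 0.60, 1.40],
    },
    "nontherapy_group": {
        "control": [0.00, -0.30, 0.00, 0.33, -0.33, 0.00, -0.60, -0.45, -0.20, -0.35],
        "therapy": [0.20, -0.05, 0.45, 0.15, 0.28, 0.20, 0.45, 0.15, 0.15, 0.25],
        "third": [0.60, 0.10, -0.15, -0.40, 0.30, 0.20, 0.10, 0.00, -0.20, -0.10],
    },
}


def count_increases(values):
    """Number of children whose grade score rose (gain strictly above zero)."""
    return sum(1 for v in values if v > 0)


increase_counts = {
    group: {period: count_increases(v) for period, v in periods.items()}
    for group, periods in gains.items()
}

for group, counts in increase_counts.items():
    print(group, counts)
```

A gain of exactly zero is not counted as an increase, which is the reading that reproduces the counts given in the text.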


The average gain of the therapy group may 
be given a statistical treatment. The hypothesis
to be tested is that the mean of the gains of the 
therapy group in the second period of the study 
was not significantly different from the mean 
of the gains of the same group in the first per- 
iod of the study. The results of the statistical 
computations are given in Table 4. These re- 
sults show that the gains of the therapy group 
in the first and second periods of the study 
were significantly different and that the ther- 
apy group made a significantly greater gain in 
the therapy period than it did in the control 
period. (These results are significant at the 
.001 level of probability.) 
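
A comparison of this kind is commonly made with a t test on the paired differences. The sketch below applies the standard paired-difference formula to the d column of Table 4. The computational formula used in the paper is not given, so the function is an assumption and the value it returns is not claimed to reproduce the printed t.

```python
import math

# Differences d = (T3 - T2) - (T2 - T1) for the eight therapy children (Table 4).
d = [0.80, 1.25, 0.80, 0.90, 0.85, 1.80, 1.20, 0.25]


def paired_t(differences):
    """Standard paired t: mean difference divided by its standard error."""
    n = len(differences)
    mean = sum(differences) / n
    # Sum of squared deviations about the mean difference.
    ss = sum((x - mean) ** 2 for x in differences)
    sd = math.sqrt(ss / (n - 1))
    return mean / (sd / math.sqrt(n)), n - 1


t, df = paired_t(d)
print(round(sum(d), 2), round(sum(x * x for x in d), 4), df)
print(t > 5.405)  # 5.405 is the .001 value for 7 degrees of freedom listed in Table 5
```

The sums of d and of d squared agree with those printed in Table 4, and with 7 degrees of freedom the statistic obtained this way exceeds the .001 value, in line with the conclusion drawn in the text.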

The hypothesis may also be tested that there 
was no significant difference between the mean 
of the gains of the therapy group in the con- 
trol period and the mean of the gains for this 
same group, measured by the difference in 
grade score of the second and fourth tests. The 
results of the statistical computations are pre- 
sented in Table 5. The statistical treatment 
shows that there is no evidence to assume that 
this hypothesis is correct. It is apparent that 
the gains of the therapy group during the sec- 
ond and third periods of the study were signi- 
ficantly greater than the gains during the first 
period of the study. (These results are signi- 
ficant at the .01 level of probability.) 
TABLE 3
GRADE SCORE ON THE GATES PRIMARY AND ADVANCED
PRIMARY READING TESTS FOR EACH CHILD

Name              T₁ᵃ     T₂ᵇ     T₂−T₁ᶜ    T₃ᵈ     T₃−T₂    T₄ᵉ     T₄−T₃    T₄−T₂

THERAPY GROUP
 1. Nancy         1.95    1.95      .00     2.75    + .80    2.80    + .05    + .85
 2. Jean          2.30    2.40    + .10     3.75    +1.35    4.00    + .25    +1.60
 3. Bernice       2.10    2.20    + .10     3.10    + .90    3.50    + .40    +1.30
 4. George        2.75    2.30    − .45     2.75    + .45    2.60    − .15    + .30
 5. David         3.55    3.35    − .20     4.00    + .65    6.00    +2.00    +2.65
 6. Janice        3.10    3.35    + .25     5.40    +2.05    5.40      .00    +2.05
 7. Mary          3.10    3.35    + .25     4.80    +1.45    5.40    + .60    +2.05
 8. Jack          3.35    3.55    + .20     4.00    + .45    5.40    +1.40    +1.85

NONTHERAPY GROUP
 9. Dorris        2.00    2.00      .00     2.20    + .20    2.80    + .60    + .80
10. Harry         1.95    1.65    − .30     1.60    − .05    1.70    + .10    + .05
11. Emily         2.90    2.90      .00     3.35    + .45    3.20    − .15    + .30
12. Lester        3.22    3.55    + .33     3.70    + .15    3.30    − .40    − .25
13. Bob           3.55    3.22    − .33     3.50    + .28    3.80    + .30    + .58
14. Grace         3.10    3.10      .00     3.30    + .20    3.50    + .20    + .40
15. Sibil         3.35    2.75    − .60     3.20    + .45    3.30    + .10    + .55
16. Roberta       3.75    3.35    − .45     3.50    + .15    3.50      .00    + .15
17. Christine     3.55    3.35    − .20     3.50    + .15    3.30    − .20    − .05
18. Wally         3.10    2.75    − .35     3.00    + .25    2.90    − .10    + .15

ᵃGrade score on the first test of paragraph meaning. Form 1, Type III of the Gates Primary Reading Tests
was used to obtain this score.

ᵇGrade score on the second test of paragraph meaning. Form 2, Type III of the Gates Primary Reading
Tests was used to obtain this score.

ᶜThis is the discrepancy between the first and second tests. This is given in terms of school grades and
fractions of school grades.

ᵈGrade score on the third test of paragraph meaning. Form 1, Type II of the Gates Advanced Primary
Reading Tests was used to obtain this score.

ᵉGrade score on the fourth test of paragraph meaning. Form 2, Type II of the Gates Advanced Primary
Reading Tests was used to obtain this score.


It has already been stated that the children
were tested with the Gray Oral Reading Paragraphs
at the beginning of each period of the study and
after the third period. The scores for each child
in the class and the changes in score for each
period of the study are given in Tables 6 and 7.
It was noted that these gains agreed, in general,
with the gains measured by the tests of silent
reading. During the control period, six of the
eight children in the therapy group and eight of
the ten children in the nontherapy group improved
in oral reading ability. In the therapy period
five of the children in the therapy group and two
of the children in the nontherapy group showed
gains in oral reading ability. During the third
period, all eight of the children in the therapy
group and four of the children in the nontherapy
group gained in oral reading ability.

The ratings of the judges. To insure uniformity,
the ratings of the judges were obtained by means
of questionnaires. Jean, Bernice, David, and
Janice were rated by all three judges as
exhibiting significant emotional maladjustment.
Nancy and Jack were rated by two of the judges as
showing significant emotional maladjustment,
while one judge felt that they were adequately
adjusted. Mary was voted well adjusted by two
judges and by the third judge as exhibiting
significant social maladjustment. George was
rated by all three judges as being well adjusted.
The judges were agreed that Nancy, Jean, Bernice,
David, and Jack had gained in emotional
adjustment as a result of the play therapy
experience. Two of the judges were agreed that
George and
TABLE 4
THE COMPARISON OF THE DIFFERENCE OF THE MEANS OF THE GAINS OF THE THERAPY GROUP IN
SILENT READING IN THE FIRST AND SECOND PERIODS OF THE STUDY

Name          T₃ᵃ     T₂ᵇ     T₃−T₂    T₁ᶜ     T₂−T₁     dᵈ       d²

Nancy         2.75    1.95     .80     1.95      .00      .80     .6400
Jean          3.75    2.40    1.35     2.30      .10     1.25    1.5625
Bernice       3.10    2.20     .90     2.10      .10      .80     .6400
George        2.75    2.30     .45     2.75    − .45      .90     .8100
David         4.00    3.35     .65     3.55    − .20      .85     .7225
Janice        5.40    3.35    2.05     3.10      .25     1.80    3.2400
Mary          4.80    3.35    1.45     3.10      .25     1.20    1.4400
Jack          4.00    3.55     .45     3.35      .20      .25     .0625

M₁ = 1.01     M₂ = .03     Σd = 7.85     Σd² = 9.1175
t = 16.125    Degrees of freedom = 7
t > t.001
P < .001

ᵃT₃ is the grade score on the third test of paragraph meaning of the Gates Reading Tests.

ᵇT₂ is the grade score on the second test of paragraph meaning of the Gates Reading Tests.

ᶜT₁ is the grade score on the first test of paragraph meaning of the Gates Reading Tests.

ᵈd is a difference and is obtained by applying the formula: (T₃−T₂) − (T₂−T₁).


Mary had not gained in emotional adjustment, 
and two were agreed that Janice did gain. 


THE CONCLUSIONS AND IMPLICATIONS 


The data of this study indicate that some 
factor or factors were operative during the 
second and third periods of the study which 


caused the reading gains of the children du- 
ring these periods to be greater than the gains 
which they showed in the control period. On 
the basis of the experimental design, the study 
of the individual child, the observations of the 
reading instruction, the study of the interview 
transcriptions, and all other available infor- 


TABLE 5
COMPARISON OF THE DIFFERENCE OF THE MEANS OF THE GAINS IN SILENT READING OF THE THERAPY
GROUP IN THE SIX WEEKS OF THE CONTROL PERIOD AND THE TWELVE

Name          T₄ᵃ     T₂ᵇ     T₄−T₂    T₁ᶜ     T₂−T₁    2(T₂−T₁)ᵈ    dᵉ       d²

Nancy         2.80    1.95     .85     1.95      .00       .00       .85     .7225
Jean          4.00    2.40    1.60     2.30      .10       .20      1.40    1.9600
Bernice       3.50    2.20    1.30     2.10      .10       .20      1.10    1.2100
George        2.60    2.30     .30     2.75    − .45     − .90      1.20    1.4400
David         6.00    3.35    2.65     3.55    − .20     − .40      3.05    9.3025
Janice        5.40    3.35    2.05     3.10      .25       .50      2.55    6.5025
Mary          5.40    3.35    2.05     3.10      .25       .50      2.55    6.5025
Jack          5.40    3.55    1.85     3.35      .20       .40      2.25    5.0625

M₁ = 1.58     M₂ = .06     Σd = 14.95     Σd² = 32.7025
t = 5.241     Degrees of freedom = 7
t.01 = 3.499
t.001 = 5.405
.01 > P > .001

ᵃT₄ is the grade score on the fourth test of silent reading.

ᵇT₂ is the grade score on the second test of silent reading.

ᶜT₁ is the grade score on the first test of silent reading.

ᵈThe quantity (T₂−T₁) is multiplied by 2 to make the six week gain of T₂−T₁ comparable to the twelve
week gain of T₄−T₂.

ᵉd is obtained by applying the formula (T₄−T₂) − 2(T₂−T₁).
TABLE 6
TIME AND ERROR SCORE OF EACH CHILD ON THE FIRST AND SECOND
TESTS WITH THE GRAY ORAL READING PARAGRAPHS

                  First Test         Second Test
Name              (Set I or II)ᵃ     (Set I or II)       T₂−T₁ᵇ

THERAPY GROUP
 1. Nancy         60-12ᶜ             50-1                +
 2. Jean          35-2               20-1                +
 3. Bernice       50-5               45-5                +
 4. George        50-5               35-1                +
 5. David         65-0               70-1                −
 6. Janice        70-2               55-0                +
 7. Mary          75-4               75-2                +
 8. Jack          70-2               80-2                −

NONTHERAPY GROUP
 9. Dorris        40-5               25-2, 65-10         +
10. Harry         d                  d                   0
11. Emily         45-4               35-0, 55-1          +
12. Lester        80-7               75-4                +
13. Bob           75-2               70-1                +
14. Grace         110-9              115-7               +
15. Sibil         95-11              85-8                +
16. Roberta       90-0               75-5                −ᵉ
17. Christine     65-3               60-0                +
18. Wally         75-3               70-1                +

ᵃThe Gray Oral Reading Paragraphs are in four sets of increasing difficulty. Each set has five equivalent
forms. The sets which are reported under each test period are those which suited the ability level of the child.

ᵇThe gain during the period between test 1 and test 2.

ᶜThe first number is the time in seconds required to read the selection and the second number is the number
of errors made.

ᵈChild could not read selection.

ᵉThe increase in errors is here considered to overbalance the decrease in time.


mation, it appears that the play therapy experience
which the children received could account
for their changed reading skill. It may be
concluded that when the eight retarded read- 
ers of this study received a nondirective play 
therapy experience, they showed significant 
gains in their reading ability. 

The design of this study does not permit a 
conclusion on the effect of maladjustment on 
the reading ability of a child, but the results 
do suggest directions for future research. 
Future investigation might well take the form 
of an inquiry into the effects of play therapy 
on the reading level of children who show ade- 
quate emotional adjustment. If it can be shown 
that children who are making adequate emo- 
tional adjustment do not gain significantly in 
reading ability as the result of such play ex- 
perience, then it can be assumed that the 
reading gains noted in the present study were 
possibly the result of treatment of the malad- 


justment which the child evidenced. If, though, 
well-adjusted children did gain in reading 
skill as the result of the play therapy experi- 
ence, maladjustment should be regarded as ir- 
relevant to the changes recorded in the present 
study. 

The source of the reading gains which the 
children made in this study is in need of fur- 
ther investigation. It is possible that one of two 
things occurred: (1) these children were able 
to learn at a more rapid rate when they had 
received the play therapy experience, or (2) 
the gains which the children showed in read- 
ing skill resulted from information which the 
child already possessed but was unable to util- 
ize with maximum effectiveness. Both of these 
alternatives are possibly true, but the size of 
the gains recorded in this study and the length 
of the study lend weight to the second inter- 
pretation. In a longer study the first possibility 
might assume more significance. 
TABLE 7
TIME AND ERROR SCORE OF EACH CHILD ON THE THIRD AND FOURTH
TESTS WITH THE GRAY ORAL READING PARAGRAPHS

                  Third Test                     Fourth Test
Name              (Sets I-III)       T₃−T₂ᵃ      (Sets I-III)      T₄−T₃ᵇ

THERAPY GROUP
 1. Nancy         59-3ᶜ              −           42-3              +
 2. Jean          21-0, 98-3         +           100-2             +
 3. Bernice       46-3               +           35-2              +
 4. George        32-2               0ᵈ          25-1              +
 5. David         61-0, 74-1         +           61-1              +
 6. Janice        59-1, 74-4         −           71-2              +
 7. Mary          68-0, 88-4         +           75-1              +
 8. Jack          62-0, 66-1         +           56-1              +

NONTHERAPY GROUP
 9. Dorris        35-3, e            −           27-2              +
10. Harry         e                  0           e                 0
11. Emily         33-2, e            −           29-1              +
12. Lester        80-1               +           77-4              −
13. Bob           56-2, 74-9         0ᵈ          72-7              +
14. Grace         136-12             −           130-8             +
15. Sibil         82-8               0           91-9              −
16. Roberta       67-3               +           77-4              −
17. Christine     57-2               −           56-5              −
18. Wally         63-5               −           66-6

ᵃThe gain in the therapy period or between test 2 and test 3.

ᵇThe gain in the third period of the study between test 3 and test 4.

ᶜThe first number is the time in seconds required to read the selection and the second number is the number
of errors made.

ᵈThe increase in errors and the decrease in time are considered to counterbalance.

ᵉChild could not read selection.


The data also indicate that the gains in 
reading ability which the retarded readers of 
this study made appeared immediately after 
therapy for some of the children and after a 
short period following therapy for the others 
who exhibited a gain. Further investigation is 
needed to determine if there is a relationship 
between the rapidity of appearance and the 
size of the reading gain following nondirective 
play therapy and the extent of the emotional 
disturbance which some retarded readers show. 

This study was concerned with the gain in 
reading ability which a child shows following 
a nondirective play therapy experience. It is 
possible that such an experience may cause 
changes in abilities other than reading ability. 
There is a need for an investigation of the ef- 
fects of a nondirective play therapy experience 
on the abilities of children who are classified as 
retarded in other subjects included in the 


school curriculum. 

The conclusion that personal change fol- 
lows nondirective play therapy is certainly not 
unique to this study. Many other studies have 
shown that this is the result to be expected. 
This study adds to the body of information 
on the length of therapeutic treatment neces- 
sary to produce personality changes. It has 
been shown that, as a result of six individual 
and three group play therapy sessions, person- 
ality changes may occur. 

It appears that although seven of the eight 
retarded readers in this study did show emo- 
tional disturbances, and although there were 
common personality characteristics in these 
children, there were enough differences among 
them to preclude any broad classification of 
types of maladjustment. This finding tends 
to support the conclusion of other studies that 
there is no single type of personality maladjustment
in the retarded reader. It likewise
lends weight to the hypothesis that inability to 
read may be connected with the attitudinal 
system or the self-concept of the individual. 

The causes of the changes in reading abili- 
ties recorded in this study are in need of fur- 
ther investigation. It has been suggested that 
the change which occurs in the child as a re- 
sult of a nondirective play therapy experience 
is a change in self-concept. More adequate 
measures of self-concept must be devised in 
order to test the hypothesis that the change in 
reading ability of some retarded readers fol- 
lowing a nondirective play therapy experience 
is related to the change in the child’s self-con- 
cept. Experimentation is also needed to deter- 
mine how the therapy experience enables the 
child to change and what techniques are most 
valuable in facilitating this change. 

This study shows that the change in reading 
ability which followed the nondirective play 
therapy experience was present six weeks after 
the therapy had ended. There is a definite need 
to determine if this changed ability is perman- 
ent or if it disappears after a time. 

It cannot be concluded from this study that 
the child who is having difficulty with his 
reading is also having difficulty with his ad- 
justment problems. Before this conclusion 
could be approached there would have to be a 
more thorough demonstration that the emo- 
tional maladjustment which the children in 
this study evidenced was not an accidental 
characteristic of the sample which was chosen 
for the study. Even though it might be found 
that emotional maladjustment does not exist 
in retarded readers to the extent found in the 
sample which was investigated in this study, 
the subject may not be dismissed lightly. 

Schools have attempted to prevent emotional 
maladjustment from developing and have 
worked toward eliminating maladjustment 
that does occur. This is probably as it should 
be, and what is needed is not a greater desire 
on the part of the school but better tools with 
which to work in preventing and correcting 
emotional maladjustment. 

This study and many other studies concerned
with nondirective therapy have shown that
personal changes do result from a nondirec- 


tive play therapy experience. Axline [1] has 
shown that the procedures of nondirective 
therapy can be used in the classroom. If per- 
sonal changes do result from nondirective 
therapy and if these techniques are adaptable
to the classroom, then an approach to
teaching is indicated which would include a
corrective mental hygiene aspect. 


SUMMARY 


This study was an investigation of the ef- 
fects of individual and group play therapy on 
the reading level of retarded readers. Eight 
retarded readers were selected for the play 
therapy experience. The criterion of reading 
retardation was a negative discrepancy between
mental age and reading age. The study
was designed to include three periods of thir- 
ty school days each. The first period was a 
control period, which was intended to measure 
the gains of the children during a period in 
which no play experience was given. During 
the second period, the therapy period, the chil- 
dren were given a play therapy experience of a 
nondirective nature. The third period was in- 
cluded to measure the gains which followed 
immediately after therapy. A measure of intel- 
ligence was obtained during the control period 
and measures of silent and oral reading abili- 
ties were made before each of the three periods 
and following the third period. 

As a result of the play therapy experience 
it was concluded: (1) significant changes in 
reading ability occurred as a result of the play 
therapy experience, (2) personal changes may 
occur in nondirective play therapy in as little 
as six individual and three group play therapy 
sessions, and (3) there appears to be no com- 
mon personality maladjustment present in this 
group of retarded readers. 


Received July 1, 1949. 
The most widely used individual tests giving
measures of adult verbal ability are
the 1937 revision of the Stanford-Binet [12], 
and the Wechsler-Bellevue [15]. The Wechs- 
ler vocabulary subtest has no separate stand- 
ardization but the standard score can be used 
to obtain an estimate of verbal ability. The 
Stanford-Binet items are mostly verbal at the 
higher levels. The Stanford-Binet has often 
been criticized for requiring verbal production 
facility and for lack of adequate norms for 
adults. 

Numerous shorter tests of verbal ability at 
the adult level have been suggested. Thorndike
[13] worked out the MA value of scores on 
various sets of vocabulary items derived from 
the revised Stanford-Binet vocabulary test for 
use by the American Institute of Public Opin- 
ion. A 15-word test was estimated to have a 
reliability of about .90. Short forms of the 
Wechsler-Bellevue using mainly verbal items 
have also been worked out by Geil [8], Gur- 
vitz [9], and Rabin [11]. Kent’s [10] EGY 
test consists of only 10 items, and is easy to 
give, but norms adequate for general purposes 
are not available, and the test’s reliability and 
validity have not been evaluated for a repre- 
sentative group of adults. 

In spite of the common recognition of the 
excellence of vocabulary tests and the many 

1Acknowledgment is due Mr. Neil W. Coppin- 
ger, Mrs. Helen S. Ammons, and Mr. Allyn F. M. 
Munger of Tulane University for reading the man- 
uscript critically and offering many helpful sugges- 
tions. The test and manual with final scale norms, 


answer sheets, and instructions for administration 
[1] can be obtained from R. B. Ammons. 


tests developed, no completely adequate test is 
available at present for general clinical pur- 
poses. Such a test would take only a short time 
to administer, would be satisfactorily reliable 
and valid for individual use, and would not 
require extensive verbal expression by the client. 
Its scores could be expected to show relatively 
little age deterioration [15], would a priori 
represent the developmental stages of social 
concepts, and would probably be highly cor- 
related with scores on other more general tests 
of verbal ability. 

With these considerations in mind, a search 
was made for the most promising testing meth- 
od. The multiple-choice procedure with pic- 
tures as used by Van Alstyne [14] seemed 
best. Ammons and Huth [4] tried out this 
method, and found it excellent in practice. A 
program was set up to obtain and standardize 
a vocabulary test for a wide range of ability
levels and several population groups [2, 3, 5,
6, 7].


PROBLEM 


The purpose of this study was to construct 
and evaluate an adult-level recognition-type 
vocabulary test, using the 16 4-picture plates 
developed by Ammons and Huth [4]. The 
following steps were taken: (a) discovery of 
suitable preliminary items, (b) testing of a 
representative adult population, (c) item se- 
lection, (d) division of items into two equiva- 
lent forms, (e) calculation of norms, and (f) 
evaluation of the forms as to reliability and 
validity. This study, with two previous papers 
[3, 7] in the series, provides a complete set of
norms for the general white population of the 
United States for the Full-Range Picture Vo- 
cabulary Test. 


PROCEDURE 

Materials. The words of the Wechsler vo- 
cabulary subtest [15] were mimeographed so 
that the answers could be recorded. Ammons 
and Huth [4] had tried out a number of words 
with their 16 plates and the group [2, 3, 5, 
6, 7] collected more. Of the total of 291 thus 
available, 43 were eliminated by group dis- 
cussion, and the remaining 248 were adminis- 
tered to 20 college students, representing 5 
levels of ability, 4 students each with Wechsler
IQ’s of 99 to 109, 112 to 119, 121 to 129,
131 to 138, and 140 to 144. 

Correct responses were tabulated by ability 
level for each word, and analysis made. The 
preliminary item selection for levels below av- 
erage adult was based on school- and pre- 
school-age children, whose testing is described 
elsewhere [3, 7]. After this preliminary test- 
ing and item selection a total of 226 words 
remained. These were arranged by plate in 
estimated order of difficulty, based on the 50 
per cent passing point calculated from a moving
average including five levels at a time. Of
the items, 56 were potentially of a difficulty 
satisfactory for testing persons of average 
adult or greater ability. A more detailed description
of the selection of the 226 standardization
items can be found in Ammons and Rachiele [6].

Subjects. The standardization sample chosen
consisted of 120 persons, ages 18 to 34 inclusive,
equally divided as to sex, and limited
to members of the white race living in and
near Denver. The average age and standard
deviation of ages for the female subjects were
24.6 and 2.4 years, and for the males 25.3 and
2.5 years respectively.


Persons were tested from occupational cate- 
gories in numbers proportional to the distri- 
bution of occupations in the white adult popu- 
lation as a whole, ages 18 to 34 inclusive, as 
indicated in the 1940 census [16]. The pres- 
ent sample of 60 adult males was composed of 
persons employed, or immediately employable, 
in specific occupations, and is therefore repre- 
sentative of about 75 per cent of the white 
males in the same age range in the 1940 popu- 
lation. The female sample was composed of 
employed women and housewives, and there- 
fore represents about 85 per cent of the fe- 
males in the same age range in the 1940 popu- 
lation. Housewives were chosen so that the 
occupations of their husbands were representa- 
tive of those of the employed white male popu- 
lation, 18 to 34. 


TABLE 1
EMPLOYMENT STATUS OF WHITE ADULTS, AGES 18 TO 34 INCLUSIVE, IN AND OUT OF THE LABOR
FORCE, 1940, UNITED STATES, AND THAT OF PRESENT SAMPLE

                                               Males               Females
Occupational                              Census    Sample    Census    Sample
Groups*                                   per cent†    N      per cent†    N
Professional and semiprofessional            5.9       4         5.3       3
Farmers and farm managers                    8.8       5          .1       —
Proprietors, managers and
  officials (except farm)                    6.2       4          .7       1
Clerical, sales and kindred workers         16.6      10        14.3       9
Craftsmen, foremen and kindred workers      12.7       8          .3       —
Operatives and kindred workers              24.4      15         8.1       5
Service workers                              6.1       3         7.3       4
Farm laborers and foremen                   10.2       6          .4       —
Laborers, except farm and mine               9.1       5          .4       —
Own home housework                            —        —        63.1      38
Total                                      100.0      60       100.0      60

*Occupational categories utilized in the 1940 census reports [16].
†Taken from census reports [16a, 88, 98, 104; 16b, 17; and 16c, 65].

Table 1 shows the breakdown of the sample
by occupational categories. Subjects were
obtained for testing in several manufacturing 
companies, a sanitarium (employees only), in 
parks, in a veterans hospital (surgical cases), 
in an opportunity school, on farms near Den- 
ver, through several churches, and by canvass- 
ing several city areas for housewives. 

Testing. The Wechsler-Bellevue vocabulary 
subtest was administered first, using standard 
procedure [15]. All responses were recorded 
verbatim for later careful scoring. On com- 
pletion of this test, Plate I of the picture vo- 
cabulary test was brought out, and the pro- 
cedure explained and demonstrated [1], using 
simple words from the preschool level. Plates 
were presented one at a time, in order from 
1 to 16. With each plate, testing was started 
with words at the subject’s estimated level of 
ability, then continued up until three consecu- 
tive items had been failed and down until three 
consecutive items had been passed. All items 
below this level were considered passed, and 
were credited to the score. All testers used the 
same procedures in testing; the instructions 
had been mimeographed to ensure uniformity. 
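
The crediting rule described above can be sketched in code. This is only a minimal illustration, not the published scoring procedure: the function name `score_plate`, the zero-based item positions, and the example responses are hypothetical, and the sketch assumes that the words administered on a plate form one unbroken run in order of difficulty.

```python
def score_plate(n_items, responses):
    """Score one plate under the three-pass / three-fail stopping rule.

    n_items   -- number of words on the plate, ordered easiest (0) to hardest
    responses -- dict mapping the position of each administered word to
                 True (passed) or False (failed); a hypothetical data layout
    """
    if not responses:
        return 0
    positions = sorted(responses)
    lowest, highest = positions[0], positions[-1]
    # The sketch assumes testing covered one contiguous run of items.
    if positions != list(range(lowest, highest + 1)):
        raise ValueError("administered items must form an unbroken run")
    if highest >= n_items:
        raise ValueError("item position outside the plate")

    # Downward testing ends after three consecutive passes (or at the easiest word).
    basal_ok = lowest == 0 or (
        len(positions) >= 3 and all(responses[p] for p in positions[:3])
    )
    # Upward testing ends after three consecutive failures (or at the hardest word).
    ceiling_ok = highest == n_items - 1 or (
        len(positions) >= 3 and not any(responses[p] for p in positions[-3:])
    )
    if not (basal_ok and ceiling_ok):
        raise ValueError("stopping rule not yet satisfied")

    credited_below = lowest  # every untested easier word counts as passed
    passed = sum(1 for p in positions if responses[p])
    return credited_below + passed


# Hypothetical record: four easier words credited, five words passed outright.
example = {4: True, 5: True, 6: True, 7: True, 8: False,
           9: True, 10: False, 11: False, 12: False}
plate_score = score_plate(15, example)
```

Words above the three consecutive failures add nothing to the score, so they need not be stored at all.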


RESULTS 

The first step in analysis consisted of the 
scoring of the Wechsler vocabulary test for 
each subject. These test scores were then ar- 
ranged in order of their magnitude from lowest
(7½ raw score points) to highest (40
points), and the subjects were divided into 
six groups on the basis of their scores. The 
subjects with the 20 lowest scores formed 
Group 1, the subjects with the 20 next low- 
est scores became Group 2, etc., establishing 
successive levels of ability in the sample. The 
mean Wechsler vocabulary raw scores for the 
six groups thus set up were 14.1, 19.0, 22.9, 
25.2, 27.0 and 30.8, respectively. 

The next step was to analyze the picture 
vocabulary responses of all subjects. Each 
subject was credited with passes on all words 
falling at levels below his three successive pas- 
ses, and was charged with failures on all words 
falling at levels above his three successive 
failures. Then the percentage of correct re- 
sponses by the members of each of the six 
groups was computed for each word. An index
number for each word was calculated by locating
its 50 per cent passing point, interpolating
if the point fell between two groups. Thus,
an index number of 2.5 meant that a word
was passed by less than 50 per cent of the
subjects in Group 2 but by more than 50 per
cent of the subjects in Group 3, and would
presumably be passed by exactly 50 per cent of
a group lying halfway between Groups 2 and
3 in ability. The relative difficulty of each
word was, then, indicated by the size of its
group placement index number.
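
The interpolation of the 50 per cent passing point can be illustrated with a short sketch. It assumes straight-line interpolation between adjacent groups and uses the first crossing of 50 per cent; the function name `placement_index` and the percentages in the example are hypothetical and are not taken from the standardization data. The optional imaginary top group, assumed to pass every word, lets the same routine handle words failed by most subjects even in the highest group.

```python
def placement_index(pct_passing, imaginary_top=100.0):
    """Return the group placement index for one word.

    pct_passing   -- per cent of subjects passing the word in Groups 1, 2, ...
                     (lowest to highest ability)
    imaginary_top -- per cent assumed for an imaginary group above the
                     highest real group
    Returns None when the word falls below the range of the sample, that
    is, when more than 50 per cent of Group 1 pass it.
    """
    levels = list(pct_passing) + [imaginary_top]
    if levels[0] > 50.0:
        return None
    if levels[0] == 50.0:
        return 1.0
    for i in range(len(levels) - 1):
        lower, upper = levels[i], levels[i + 1]
        if lower < 50.0 <= upper:
            # Group numbers start at 1, so position i is Group i + 1.
            return (i + 1) + (50.0 - lower) / (upper - lower)
    return None


# Hypothetical word: 40 per cent pass in Group 2, 60 per cent in Group 3.
index_mid = placement_index([30, 40, 60, 75, 90, 95])
# Hypothetical hard word: fewer than half pass even in Group 6.
index_hard = placement_index([0, 5, 10, 20, 30, 40])
```

The first word receives an index halfway between Groups 2 and 3, and the second an index between 6.0 and 7.0.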

174 of the 226 tentative words were included
in the analysis. Of this total, 98 were
found to fall below the adult range of difficulty;
that is, they were passed by more than
50 per cent of the subjects in Group 1. Of
the remaining 76 words, 66 had a 50 per cent
passing point falling somewhere within the
range of ability of our adult sample (index
numbers from 1.0 to 6.0). The other 10 were
passed by less than 50 per cent of the subjects,
even in Group 6, and in order to compute index
numbers for them, an imaginary Group 7
with 100 per cent correct responses to these
words was assumed. In this way, these 10
words received index numbers falling between
6.0 and 7.0.

The increases in per cent of correct respon- 
ses, from group to group, were carefully ex- 
amined for the 76 adult-level words, and a 
number of words were eliminated because of 
their relatively poor discrimination between 
successive levels. Ten words were selected for 
a chi-square test of significance of the differ- 
ence between the numbers of males and fe- 
males giving correct responses. In no case was 
the difference found to be significant at even 
the 10 per cent level of confidence. 

The final selection of words appropriate for 
preschool-age and school children, ages 2 to 17 
inclusive, was left up to the individuals re- 
sponsible for that portion of the standardiza- 
tion [3, 6, 7], although no items showing poor 
discrimination with adult subjects were re- 
tained by them. Conversely no item was re- 
tained at the adult level which showed poor 
discrimination with the school-age subjects. 


Adult level 3 was considered equivalent to CA 
17 for school children, and adult levels 1 and
2 were not utilized. The adult-level items re- 
tained in the final selection had group place- 
ment index numbers of 3.0 or above in all but 
two cases. 

Shortages developed at group levels 3 and 5. 
Such shortages were met by borrowing words 
from adjacent levels, as is indicated in the 
following listing of items for adult testing 
which gives eight items at each point level, 
four in each form. Levels 1 to 16 are indicated 
in the full test [1] in terms of approximate 
chronological age at the 50 per cent passing 
point and the adult levels are given in terms 
of the group placement index numbers previ- 
ously described. Forms A and B of the full 
test were constructed by assigning the words 
in successive pairs to them, 85 items to a form. 

The following items were selected for the 
adult levels: 

Form A; Plate 1: egress (A6.3); Plate 2: revelry
(A4.0), ebullience (A6.4); Plate 3: replenishment
(A3.1), retaliation (A4.1); Plate 4: none; Plate 5:
none; Plate 6: antiquated (A3.8); Plate 7: none;
Plate 8: dehydration (A4.3); Plate 9: agrarian
(A6.2); Plate 10: none; Plate 11: none; Plate 12:
mastication (A2.6), itinerant (A4.5), coercion
(A4.6), corpulence (A5.5), insatiable (A5.6);
Plate 13: deleterious (A6.2); Plate 14: none;
Plate 15: displacement (A5.0), perusing (A5.0);
Plate 16: none.

Form B; Plate 1: translucent (A2.5), depredation
(A4.0); Plate 2: terpsichorean (A6.0); Plate
3: aggressiveness (A3.6); Plate 4: domicile (A4.0);
Plate 5: munificence (A5.7); Plate 6: none; Plate
7: discourse (A4.5); Plate 8: none; Plate 9: pecuniary
(A4.9); Plate 10: tonsorial (A4.4); Plate
11: none; Plate 12: gourmand (A4.6), repast
(A5.2), mendicant (A6.3); Plate 13: lacrimation
(A6.3); Plate 14: constabulary (A3.2), fortuitous
(A6.4); Plate 15: supine (A5.5); Plate 16: none.


Scores for each subject were obtained in
terms of the number of correct responses out
of a maximum possible of 85 on each form.
Table 2 presents a comparison of the ranges,
means, medians and standard deviations for
various distributions. All are negatively skewed
to a moderate extent, as evidenced by the
consistently higher values of the medians as
compared with the means.


The fact that the Wechsler verbal scale IQ 
estimated from the group raw score mean of 
23.4 is 104 suggests that the average intelli- 
gence of the present sample approximates 
closely that of the population used by Wechs- 
ler. According to Wechsler’s table of weighted 
scores, the standard deviation on the vocabu- 
lary test is about 7 raw score points, which 
is slightly larger than our figure of 5.8. 

Smoothed raw score equivalents of selected
percentiles from 1 to 99, for males and females
and combined groups, Form A and Form B,


TABLE 2
COMPARISON OF RANGE, MEAN, MEDIAN, AND STANDARD DEVIATION FOR FREQUENCY DISTRIBUTIONS OF
WECHSLER-BELLEVUE VOCABULARY TEST SCORES AND FULL-RANGE PICTURE VOCABULARY
TEST SCORES FOR THE PRESENT WHITE ADULT STANDARDIZATION GROUP

                                 N     Range      Mean    Median    SD
Wechsler-Bellevue
Vocabulary Test
  Males                          60   8.5-40.0    23.4     24.5     6.2
  Females                        60   7.5-35.5    2?       24.5     5.3
  Total                         120   7.5-40.0    23.4     24.5     5.8
Full-Range Picture
Vocabulary Test (Form A)
  Males                          60   49-84       69.6     70.2     9.3
  Females                        60   25-83       68.4     68.5     9.4
  Total                         120   25-84       69.0     69.4     9.4
Full-Range Picture
Vocabulary Test (Form B)
  Males                          60   47-84       69.2     70.5     8.8
  Females                        60   45-85       70.2     70.5     8.4
  Total                         120   45-85       69.7     70.5     8.7

TABLE 3
ESTIMATED PICTURE VOCABULARY RAW SCORES FOR
SELECTED PERCENTILES FOR THE PRESENT
WHITE ADULT STANDARDIZATION GROUP
(AGE RANGE 18 TO 34 INCLUSIVE)

                     Raw score equivalents
                  Form A                  Form B
Percentiles   Males Females Total   Males Females Total
    99          85     84     84      84     85     84
    95          83     81     82      82     83     82
    90          82     79     80      81     81     81
    80          80     77     78      78     7?     78
    70          76     75     75      75     7?     75
    60          74     72     73      73     73     73
    50          71     69     70      71     7?     71
    40          68     67     67      68     69     68
    30          65     65     65      65     66     65
    20          60     61     60      62     63     62
    10          56     58     57      57     59     58
     5          54     56     55      54     57     55
     1          51     53     52      49     54     52


are given in Table 3. These can be referred to 
directly in the interpretation of scores earned 
by subjects who are from 18 to 34 years of age. 
For older subjects, obtained raw scores can be 
corrected as indicated in Table 4 before the 
percentiles of Table 3 are used. The correction 
values in Table 4, ranging from 2 points at 
age range 35-39 to 9 at age range 55-59, were
calculated on the assumption that the mean 
picture vocabulary test scores would decline 


TABLE 4
EXTRAPOLATED VALUES, FOR AGES 35-59, OF FULL-RANGE PICTURE VOCABULARY TEST MEAN
SCORES, FORMS A AND B, WITH ESTIMATED CORRECTIONS FOR AGE DECLINE

          Wechsler-Bellevue     Full-Range Picture      Correction,
          verbal scale          Vocabulary Test         both forms
Age       weighted              mean scores             (rounded)
range     scores*               (extrapolated)†
                                Form A     Form B
17-34         47.0              68.8‡      69.6‡            0
35-39         45.5              66.4       67.3             2
40-44         44.5              65.0       65.9             4
45-49         43.5              63.5       64.4             5
50-54         42.0              61.3       62.2             7
55-59         41.0              59.9       60.7             9

*Corresponding to IQ of 100 [15, pp. 244-46].
†Calculated as in following example:
    Age range 35-39:  68.8/47.0 = 1.46;  1.46 × 45.5 = 66.4
                      69.6/47.0 = 1.48;  1.48 × 45.5 = 67.3
‡Actual, obtained in standardization.

with increasing age in proportion to the decline
in the verbal scale weighted scores actually
noted in the standardization of the Wechsler-Bellevue.
For subjects 35 years of age and
older, the appropriate correction value is to be
added to the raw score obtained, to give a corrected
raw score, for which the corresponding
percentile score can then be obtained from
Table 3.
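
The age correction can be traced step by step in a short sketch. The constants below are copied from Tables 3 and 4; the function names, the two-decimal rounding of the ratio (which follows the worked example in the footnote to Table 4), and the lookup convention in `percentile_form_a` (report the highest tabled percentile whose raw score equivalent does not exceed the corrected score) are assumptions made for illustration only.

```python
# Values copied from Table 4.
YOUNG_ADULT_VERBAL = 47.0  # verbal scale weighted score, youngest age range
VERBAL_BY_AGE = {(35, 39): 45.5, (40, 44): 44.5, (45, 49): 43.5,
                 (50, 54): 42.0, (55, 59): 41.0}
STANDARDIZATION_MEAN = {"A": 68.8, "B": 69.6}
CORRECTION = {(35, 39): 2, (40, 44): 4, (45, 49): 5, (50, 54): 7, (55, 59): 9}

# Form A, combined group, copied from Table 3 as (percentile, raw score).
FORM_A_TOTAL = [(99, 84), (95, 82), (90, 80), (80, 78), (70, 75), (60, 73),
                (50, 70), (40, 67), (30, 65), (20, 60), (10, 57), (5, 55),
                (1, 52)]


def extrapolated_mean(form, age_range):
    """Reproduce an extrapolated mean of Table 4 for one form and age range."""
    ratio = round(STANDARDIZATION_MEAN[form] / YOUNG_ADULT_VERBAL, 2)
    return round(ratio * VERBAL_BY_AGE[age_range], 1)


def corrected_score(raw_score, age):
    """Add the Table 4 correction for subjects 35 years of age and older."""
    if age < 35:
        return raw_score
    for (low, high), points in CORRECTION.items():
        if low <= age <= high:
            return raw_score + points
    raise ValueError("Table 4 gives no correction beyond age 59")


def percentile_form_a(corrected):
    """Highest tabled percentile reached by a corrected Form A score."""
    for percentile, raw in FORM_A_TOTAL:
        if corrected >= raw:
            return percentile
    return None  # below the lowest tabled percentile


# Hypothetical subject: age 42, Form A raw score of 66.
example_corrected = corrected_score(66, 42)
example_percentile = percentile_form_a(example_corrected)
```

For the hypothetical subject, the correction of 4 points raises the raw score of 66 to 70, which is the tabled 50th percentile score for Form A.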

The Pearson product-moment coefficient of 
correlation between the two forms of the test, 
based on the scores of the entire sample of 120 
subjects, is .93. The authors are inclined to 
agree with Wechsler’s opinion that the valid- 
ity of a test can be determined only through 
clinical experience. However, in order to obtain
an initial estimate of the “validity” of the
Full-Range Picture Vocabulary Test, product-moment
correlation coefficients between the
scores of the 120 subjects on each form of the
test and their Wechsler vocabulary test scores
were computed. They are, for Form A, +.86,
and for Form B, +.85. These are relatively
high values in comparison with most “validity”
coefficients. Indeed, they seem impressively
high when it is remembered that the eta correlation
between the Wechsler-Bellevue vocabulary
test and the full scale is +.85.


DISCUSSION 


The picture vocabulary test actually tests 
verbal comprehension, as well as vocabulary. 
Its nature makes the measure of vocabulary 
dependent upon the recognition difficulty of 
the pictures. The present items are not hard 
enough adequately to test the upper ranges of 
adult ability, those beyond IQ 125 on the 
Wechsler. 

The method of standardization has been pre- 
sented in detail to allow a further discussion 
of standardization procedures. It would seem 
that what is desirable in a standardization is 
that it provide performance information from a 
group representing a sufficiently large propor- 
tion of a population to allow meaningful com- 
parisons of individual scores with it. A norm- 
ative sample is not “representative,” but is 
“representative of some specific population.” 
The standardization group reported in this 
study is representative of white employed
males of ages 18 to 34, who constitute about
75 per cent of the total white adult male
population in this age range. The female
sample represents about 85 per cent of the
white female population of ages 18 to 34, excluding
only those who are neither employed
nor housewives.

The test was described as interesting by 
most of the adults tested, and was relatively 
easy to administer, taking from 10 to 15 min- 
utes. These findings seem to hold at other age 
levels as well [3, 7], and mark the test as a 
usable clinical instrument. 


SUMMARY AND CONCLUSIONS 


A preliminary set of 226 items was admin- 
istered to 60 male and 60 female adults with 
the 16 Full-Range Picture Vocabulary Test 
plates. These subjects were 18 to 34 years old, 
white, and employed or housewives. The sam- 
ple was representative with respect to age, 
sex, and occupation of approximately 80 per
cent of the white population of the United
States within this age range. Words were se- 
lected for two final adult forms, and norms 
presented for this group. The two forms cor- 
related .93 with each other, and .85 and .86 
with raw scores from the Wechsler vocabulary 
test. In view of its short administration time, 
intrinsic interest value for adults, and excellent 
reliability and validity, the test should prove to 
be highly useful for such purposes as estimating 
the intellectual capacity of verbally hand- 
icapped adults and rapid screening where 
maximum efficiency is desired. 

Received June 24, 1949. 

THE present investigation is concerned with
the hypothesis that the client's attitudinal
set at the beginning of therapy can
be observed with sufficient reliability to serve 
as a variable in therapy research. The concept 
of “therapy readiness” is used to designate 
this initial set. Such a proposal seems to be 
part of a general movement attempting to 
make explicit the techniques and attitudes in- 
volved in therapy. 

Researchers of the Nondirective group 
laid the foundations for objective therapy-analysis.
Porter [2] and Snyder [5] concentrated
on showing that therapists’ and patients’
individual responses could be identified with
regard to specific technique classifications. 
These pioneer studies demonstrated a promis- 
ing degree of reliability in the treatment of 
transcribed interview data. A more recent em- 
phasis has been the attempt to objectify broader 
categories dealing with counselor and client 
attitudes, feelings, and points of view. Exam- 
ples of this current trend are the treatment of 
insight by Curran [1] and Raimy’s [3] work 
with the client’s changing attitudes toward the 
self. In addition, reports have appeared of 
studies which emphasize the counselor’s atti- 
tudes toward therapy and the modifiability of 
these attitudes with training [4]. 

In order to test the reliability of judging 
“therapy readiness,” recorded first interviews 
of nine cases were independently ranked by 
each author as to the amount of “therapy 
readiness” demonstrated. These verbatim first 
interviews were obtained from material re- 
leased in mimeographed form by the Univer- 
sity of Chicago Counseling Center. No at- 
tempt was made to represent the total range of 
initial therapy attitudes. The cases were selected
only on the bases of their being available
and their being unknown to the rankers.


One of the authors devised a rating proced- 
ure to aid in clarifying the “therapy readiness” 
concept for her independent rankings. She first 
gave six “readiness” ratings to each of the in- 
terviews on a rough five-point scale with re- 
gard to the following aspects. (1) How easily 
does the client verbalize during the hour? (2) 
To what degree can the client express feelings 
rather than unemotional verbalizations? (3) 
What ability does he have to express and deal 
with “real” problems? (4) What is the subject’s
aim in the therapy? Is it to solve a specific
problem, reorganize things in general, or to 
finish growing up? (5) What amount of work 
does the client assume he is going to do in 
proportion to the contribution of the therapist?
(6) How much present anxiety exists? Is this 
anxiety seen by the client as related to himself, 
to the external situation, or to both? Some 
current disturbance seems conducive to “readi- 
ness,” but one would not necessarily assume a 
linear relationship between amount of anxiety 
and amount of “readiness.” The amount of 
anxiety was considered in relation to the source 
as viewed by the client. The final rankings 
were derived from a consideration of these six 
“readiness” ratings. 

The other author did not use, nor was he 
aware of, this rating procedure. His rankings 
were based solely on the answer to the ques- 
tion: which is the most ready for therapy? Yet, 
the rank-order correlation between the two 
rankings was .92. In spite of the small number 
of cases used, this correlation strongly suggests 
that the attitudinal set of clients, as demon- 
strated early in the therapy relationship, can be 
reliably observed by trained people having a 
similar concept of “therapy readiness.” 
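
For readers who wish to see how a coefficient of this size arises with only nine cases, the following sketch computes a rank-order correlation. It assumes the coefficient is Spearman's rank-difference rho and that there are no tied ranks; the text states neither point. The two rankings shown are hypothetical, chosen only to yield a value close to the reported one, since the actual rankings are not given.

```python
def rank_order_correlation(ranks_a, ranks_b):
    """Spearman's rho from two untied rankings of the same cases."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical rankings of nine first interviews by two judges.
judge_one = [1, 2, 3, 4, 5, 6, 7, 8, 9]
judge_two = [3, 2, 1, 5, 4, 6, 7, 8, 9]
rho = rank_order_correlation(judge_one, judge_two)
```

With nine cases the denominator is 720, so each unit of squared rank difference lowers the coefficient by only about .008; a sum of squared differences of 10, as in the hypothetical rankings, gives a value that rounds to .92.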

If such a therapy attitude can be consistently
judged, investigations of the following hy- 
potheses seem possible. The client’s attitudinal 
set is (1) a function of a stable personality 
factor, or (2) a function of a specific period 
in personality development. There are the fur- 
ther possibilities that these attitudes are (3) 
predominantly products of the psychological 
atmosphere in which the individual finds him- 
self, or (4) relatively consistent personality 
variables but manipulatable by the use of ap- 
propriate attitude changing techniques. It is 
hoped that this introductory discussion will 
stimulate further interest in formulating and 
exploring the significance of the client’s atti- 
tudes toward psychotherapy. 

Received December 22, 1949. 

Books


BABKIN, B. P. Pavlov, a biography. Chicago: Univ.
of Chicago Press, 1949. Pp. xiii + 365. $6.00.


Although this is not great biographical writing, 
it is sincere, conscientious, and well worth the read- 
er’s time. The first half of the book, dealing with 
the facts of Pavlov’s life, is somewhat literal report- 
ing marked by the superficiality of its interpretation 
and integration, and marred by the author’s tendency 
toward hero worship. Yet its very faults lend it a 
definite charm. The second half of the book traces 
the development of Pavlov’s experimental work and 
its theoretical interpretation. For psychologists, the 
section on conditioned reflexes will be disturbing in 
its psychological naivete. The author states that a 
“majority of psychologists” believe that conscious- 
ness causes conditioned reactions, and apparently re- 
lies on Hilgard and Marquis’ text for his knowledge 
of the current literature. Nevertheless, this is an 
interesting and worthwhile volume. It should re- 
mind us how “Pavlov” has become stylized and in- 
stitutionalized in the writings of his borrowers. As 
Lashley is quoted by the author, “. . . it seems to
me that these characteristics of his work which gave 
him greatest claim to genius have been least influ- 
ential in American psychology. It has been his mis- 
fortune to fall into the hands of philosophers. His 
influence has been rather that of a Descartes than 
of a Pasteur.”—W.A.H. 


BECK, LESTER F. Human growth. New York: Harcourt,
Brace, 1949. Pp. 124. $2.00.


Based on the motion picture film of the same 
title, this book is an admirable example of visual 
education in itself. It is a simple but dignified 
presentation of conception, gestation and birth, for 
the sex education of teen-age children. The text 
and illustrations were pretested. It is the type of 
resource for which psychologists and other coun- 
selors often have need, in dealing with some prob- 
lems of children and parents. 





NOTE: Some reviews in this issue were prepared
by the Associate Editors, who may be identified by
their initials. Unsigned reviews are by the Editor.—L. F. S.


BELLOWS, ROGER M., AND RUSH, CARL H., JR. Workbook
in personnel methods. Dubuque, Iowa: W.
C. Brown Co., 1949. Pp. v + 102. $2.10.


A workbook to accompany the senior author’s 
Psychology of personnel in business and industry. 
The twenty-two real job-like exercises provide in- 
centives and practice in the major aspects of per- 
sonnel work. 


BINGER, CARL. More about psychiatry. Chicago:
Univ. of Chicago Press, 1949. Pp. xiii + 201.
$4.00. 


In this small volume, fourteen essays which were 
originally chiefly articles or addresses prepared for 
diverse audiences have been gathered together, de- 
signed apparently for medical students and prac- 
titioners and interested laymen. They discuss psy- 
chosomatic medicine and its relation to psychiatry 
and such aspects of modern psychiatry as the so- 
cial function of psychiatrists and the inadequacies 
in their training and numbers. There is little that 
is novel to the psychologist, but the author’s frank
assessment of the limitations of psychiatric
knowledge, particularly in such matters as the relation
between character and psychosomatic disease,
is worth noting. The essay “What is Mental
Health” discusses the problem of the “normal”
effectively and sensibly. Of clinical psychologists,
Binger states, “. . . No up-to-date psychiatrist
would want to get along without his help . . . . the
position of the clinical psychologist in a psychiatric
department is like that of a surgical pathologist
in a department of surgery . . . . But he
should first and foremost be encouraged to pursue
his own researches. Too many gifted psychologists 
are now being forced into the position of therapists, 
where through lack of medical training they often 
remain frustrated, instead of using their energies 
and skills to push back the walls of our ignorance 
with methods of relative precision.”—A. R. 


DASHIELL, JOHN FREDERICK. Fundamentals of general
psychology. (3rd Ed.) Boston: Houghton
Mifflin Co., 1949. Pp. x + 690. $4.00. 


In his third edition, Dashiell again presents a 
clear and comprehensive account of the field of
psychology. The increased bulk of experimental 
material, much of it presented in an elementary 
textbook for the first time, is offset by an easier 
style of writing that makes quite complicated is- 
sues seem understandable. New stress is given to 
the social and clinical aspects of psychology. The 
extensively revised chapters on motivation, adjust- 
ment, emotion, and personality show the impact of 
recent thought and research, and provide a sound 
foundation for students who will later specialize in 
clinical psychology, as well as a sympathetic orien- 
tation for those who will not. 


GLOVER, EDWARD. Psycho-analysis: A handbook for
medical practitioners and students of comparative
psychology. (2nd Ed.) New York: Staples
Press, 1949. Pp. 367. $4.00.


This is the second and expanded edition of a 
text first published in 1939. The expansion has 
been chiefly in the statement of principles since the 
author feels that little in the way of psychoana- 
lytic research has been done in the last ten years. 
The book was written for the general medical 
practitioner. It contains two major sections of ap- 
proximately equal length covering theory and 
clinical applications, which are followed by some 
discussion of practical aspects, cost, duration, se- 
lection of analyst, etc. His statement of theory is 
comprehensive, extensive, well organized and lucid, 
but without any serious attempt to adduce evi- 
dence. The section on clinical psychoanalysis in- 
cludes discussion of psychoneurosis, psychoses, 
transitional groups, psychosexual disorders, social 
difficulties (including character neuroses) and the 
psychoanalysis of children. His treatment is thor- 
ough throughout and reasonably orthodox. He 
makes a consistent effort to develop a categoriza- 
tion of disorders on a stated rationale although 
he notes the difficulties here.—A. R. 


GOODENOUGH, FLORENCE L. Mental testing, its his- 
tory, principles, and applications. New York: 
Rinehart, 1949. Pp. xix + 609. $5.00. 


The most refreshing book yet published in its 
field, and easily the best. Long-needed broad cov- 
erage of the history, principles and methods, in- 
struments and applications. Sound and scholarly 
treatment of basic issues in theory and practice. 
Judicious and authoritative exposition of statistical
desirabilities. Excellent formulation, exemplary
style, straightforward content, imaginative
outlook, and (our compliments to the publisher) 
beautiful book-manufacture design. A worthy cap- 
stone to the brilliant career of a meticulously con- 
scientious pioneer in scientific and professional 
psychology. (And this reviewer is seldom intem- 
perate!) — E. A. D. 


GUILFORD, J. P., AND MICHAEL, WILLIAM B. The
prediction of categories from measurements:
with applications to personnel selection and clinical
prognosis. Beverly Hills, Calif.: Sheridan
Supply Co., 1949. Pp. v + 55. $1.40.


A statistical monograph, on the problem of pre- 
dicting the dichotomous classification of persons 
from tests of known validity. Previously applied 
in the main to the prediction of “succeed-fail” in 
training or in industry, the methods have interest- 
ing possibilities in clinical research involving the 
prediction of “well-sick,” “improve-regress,” and 
the like. There are nomographs for predicting the 
per cent selected by the measuring device.


HEBB, D. O. The organization of behavior. New
York: Wiley, 1949. Pp. xix + 335. $4.00. 


It has been fashionable for recent psychological 
theories to ignore the neural correlates of behavior, 
both because of their complexity and because they 
have seemed to contribute no integrating concepts 
to the theory. Reversing the trend, Hebb has pre- 
sented a physiologically oriented system of psy- 
chology. He brings to the attention of psycholo- 
gists some recent understandings of the physiology 
of the nervous system known only to a few of us, 
and uses them to clarify a number of the most 
perplexing problems of psychology. Some previous 
theories have dealt adequately with perceptual 
generalization (Gestalt), and others with the per- 
manence of learning (Hull, for example), but no 
theory has given an entirely satisfactory descrip- 
tion of both. Problems of attention and set have 
offered real difficulty to all systems. Hebb makes 
these three issues the core of his theory, and 
emerges with a novel and comprehensive formula- 
tion. Clinical psychologists will find much to inter- 
est them. The well-known lack of intellectual deficit 
following considerable ablations of the cerebrum 
is handled well by Hebb’s theory, and his discus- 
sion of motivation is provocative. His discussions 
of emotion and of mental hygiene are less satis- 
fying, suggesting that further amplification of the 
theory will be needed in these areas. The volume 
is somewhat variable in its effectiveness of com- 
munication. Some parts are read easily, while oth- 
ers are best attempted with a good book on physi- 
ological psychology at your elbow for ready ref- 
erence. But it is an important book, and worth 
the effort. 


LAW, STANLEY G. Therapy through interview.
New York: McGraw-Hill, 1948. Pp. xiii + 313. 
$4.50. 


This small, clear, spritely book was intended to 
give physicians some understanding of the use of 
psychotherapy in their general practice. In the
main, it teaches by example. Thirty-one of the
thirty-four chapters contain the “verbatim” reports
of six composite or synthetic cases of the sorts that
physicians meet. The therapy isn’t deep or ex-
tended, but it is well based in therapeutic princi-
ples. The author recognizes the necessity of the
self-responsibility of his patients, and shows his
real understanding and permissiveness. Counselors,
and students in training in psychotherapy,
can learn much from the book.


SARASON, SEYMOUR B. Psychological problems in 
mental deficiency. New York: Harper, 1949. Pp. 
x + 366. $5.00. 

A stimulating and critical survey of present
psychological theories and practices in the field of
mental deficiency. In its practical aspects, the
book makes excellent use of case-study material,
and gives special attention to the use of
psychotherapy to meet some of the needs of the
mentally deficient. In treating research in the
field, Sarason’s approach is mainly critical. He
has done an excellent job of clearing away the
debris of inadequate past studies, but has made no
attempt to build a new structure. One leaves the
book with a feeling, “Where next?” In spite of the
preponderance of critical material, the book has
clear uses. It will interest and arouse all who
deal with the problems of the mentally deficient,
and will prove a [...] textbook, especially for
students considering [...] areas of research.—B.M.L.


SEIDENFELD, MORTON A. Psychological aspects of
[...]. Springfield, Ill.: Charles C Thomas, [...].


A monograph on the psychological problems of
sick people, intended mainly for the indoctrination
of physicians and ancillary medical workers. The
author demonstrates his familiarity with the
adjustive difficulties of persons suffering from both
nondeforming and deforming diseases, and
communicates in an easy, persuasive style, with a
number of case illustrations.


SHELDON, WILLIAM H. et al. Varieties of delinquent
youth. New York: Harper, 1949. Pp. xvii + 899.
$8.00.


The empirical portion of this volume reports
200 selected biographies of boys from a rehabilita-
tion home, accompanied by the photographs from
which somatotypes were derived. The biographies
include the individual’s delinquency, family back-
ground, mental and medical history, a summary
of his activities and achievements, an index of de-
linquency and the prognosis. The cases are divid-
ed into groups indicating insufficiencies (mental
and medical, N = 100), “psychopathy” (mainly
psychoneurotic), alcoholics, gynandrophrenes
(feminine males) and “criminals.” Central tendencies
are given on the entire group in respect to
somatotypes (endomorphic mesomorphs predominating),
psychiatric index, physical indications and
parental background. There are over 30 figures of
somatotype distributions among which are: the
various groups of delinquency, psychotic patients
with different dominant components, a collegiate
and aviation cadet population. The book has no
bibliography. The results are not critically related
to the broader [...] literature [...]
constitutional types. Highly controversial concepts
are used as a basis for rating, such as human
thorough-bredness, but the reader is not given a
detailed objective basis for these judgments nor
how the difficulties that must have [...]
obtaining them [...]. The study could not be repeated
with the information given in this report. If
Sheldon is cognizant of the findings showing the
effect of culture and environment on human behavior
he fails to indicate it. He prejudices his empirical
results in the mind of the objective, careful and
informed scientist by his emotionally charged
language, [...] generalizations and his
apparent [...] about his own findings. —
F. M


STUIT, DEWEY B., DICKSON, GWENDOLEN S., JORDAN,
THOMAS F., AND SCHLOERB, LESTER. Predicting
success in professional schools. Washington:
American Council on Education, 1949. Pp. [...].


A [...] of the American Council on Education
prepared this survey of the methods and
accomplishments in the prediction of success in
engineering, law, medicine, dentistry, music,
agriculture, teaching, and nursing. The monograph
will be valuable in vocational counseling for the
[...] gives a concise [...] about the development
of selective [...].


WALLIN, J. E. WALLACE. Personality maladjust-
ments and mental hygiene. New York: McGraw-Hill,
1949. Pp. xiv + 581. $5.00.


Wallin has revised the 1935 volume of the same
title, but, in spite of the addition of some new
material and the increase in the number of
illustrative cases, the 1949 edition is hardly more
current than the older volume. The book is divided
into two parts. Part I, comprising almost half the
book, is labeled “Introduction,” and merely
presents the author’s concepts of mental hygiene.
Part II can be described best by using Wallin’s
heading: “Symptoms of personality maladjustment as
evidenced by inadequate or unwholesome modes of
response to difficulties. Specific types of faulty
methods of solving life’s problems, with preventive
and remedial suggestions.” The point of
view throughout is exceedingly static, and the 
treatment of material undiscriminating. Research 
results and “inspirational” literature are given 
equal weight. Illustrative “cases” are presented 
in abundance, but they consist of snatches of auto- 
biographical and anecdotal material from 95 stu- 
dent term papers. Two voluminous bibliographies 
are included: one in connection with the text, pre- 
sented in footnotes, and the other grouped by sub- 
jects and placed at the end of the book. The latter 
bibliography is excellent, but seems unrelated to the 
other. A tremendous amount of work has obvious-
ly gone into the revision of this book, but the net
result is a great assortment of unrelated clinical
items and a collection of inadequate cases.—M.K.


WATKINS, JOHN G. Hypnotherapy of war neuroses.
New York: Ronald, 1949. Pp. x + 384. $5.00.


The use of hypnosis in the psychotherapy of the 
traumatic neuroses of wartime raised it again to 
respectability, from the limbo of discredited tech- 
niques. The hypnotherapy of which Watkins 
writes is an accelerated analytic process, quite dif- 
ferent from the use of hypnosis in suggestion and 
reassurance. That it was an effective instrument 
with war neuroses is well demonstrated. Whether 
hypnosis will work equally well in the neuroses of 
ordinary life, with their somewhat different mo- 
tivations and less abrupt situational determinants, 
remains to be studied. The book is well and con- 
vincingly written, and contains numerous case 
studies, one of them 98 pages long, with consid- 
erable verbatim material. 


WILLIAMSON, E. G., AND FOLEY, J. D. Counseling
and discipline. New York: McGraw-Hill, 1949. 
Pp. xii + 387. $3.75. 


Counselors often experience at least some degree 
of conflict in dealing with clients who have placed 
themselves in opposition to cultural standards by 
cheating, stealing, disorderly conduct, discovered 
sexual behavior, and other violations of the mores. 
From their rich experience at Minnesota, the au- 
thors give a detailed and concrete account of how 
“discipline” problems are handled by a counseling 
staff and a faculty committee without having to 
compromise therapeutic goals relating to the self- 
responsibility of the clients. Their tables and cases 
show the decline of suspension and expulsion as 
disciplinary actions, and the potentialities of gen- 
uine counseling even under the compulsion of dis- 
cipline. A 147-page appendix gives extensive rec- 
ords of fifteen cases. 


TESTS

BACHRACH, ARTHUR J. AND THOMPSON, CHARLES 
E. Thematic Apperception Test, Modification 
for the handicapped (Experimental set). Handi- 
capped children. Individual test. 1 form. 23 
plates, 16 new or revised, 7 from standard 
TAT; manual, pp. 6. Cleveland, Ohio: Society 

for Crippled Children, 1949. 


Following the experience with the Thompson 
modification of the TAT for Negroes, the new set 
of cards provides stimuli designed to evoke great- 
er productivity from handicapped children by pro- 
viding them with figures with which they might 
identify readily. Although two cards suggest 
handicaps of vision and hearing, most of them con- 
cern the crippled child, with canes, crutches, braces 
and wheel chairs much in evidence. The recom-
mended administration follows the standard TAT.
The experimental edition is made available for re- 
search and development, and no data about spe- 
cial results and interpretations have as yet been 
presented. 


BELLAK, LEOPOLD, AND BELLAK, SONYA SOREL. Chil-
dren’s Apperception Test (C. A. T.) Ages 3-10.
Individual test. 1 form. 10 plates ($6.), with 
manual, pp. 13; record and analysis blank ($6. 
per 30). New York: C.P.S. Co., Box 42, Gracie 
Station, 1949. 


A most appealing thematic apperception test for 
young children, the C.A.T. presents animal char- 
acters in situations significant to the problems of 
childhood. A little bear helps one of two bigger 
bears in a tug-of-war. A lion sits with pipe and 
cane while a mouse views him from a mousehole. 
An adult spaniel spanks a young one, the back- 
ground a very obvious bathroom. Suggestive pro- 
tocols in the brief manual seem to show that the 
animal figures are sufficiently human-like to evoke 
identification, but enough unlike humans to free 
the child from inhibitions against expression. The 
analysis sheet provides for an interpretation main-
ly in terms of content: theme, hero, attitudes to
parental figures, family roles, anxieties, conflicts, 
and the like. The C.A.T. is a ready and needed 
clinical tool, and also a provocative research in- 
strument for future studies of age, socioeconomic, 
intellectual and ethnic groups. 


BENNETT, GEORGE K., SEASHORE, HAROLD G., AND
WESMAN, ALEXANDER G. Validation of the Dif-
ferential Aptitude Tests. Third research report,
Dec., 1949, pp. 37. Distributed as supplement 
to the Manual (1947). New York: Psychologi- 
cal Corp., 1949. 


The third supplement to the D.A.T. manual
contains 1216 new validity coefficients on the pre-
diction of subsequent academic grades, a summary
and interpretation of validation studies, and direc- 
tions for the construction of expectancy tables from 
local norms. 


BRUCE, MARTIN M. Aptitudes Associates Test of
Sales Aptitude. Adult. 1 form. (30) min. Test 
booklet (10c); scoring key (20c); manual, pp. 
6 (75c). New York: Author, 624 E. 20 St., 1947, 
1950. 


A multiple choice test the 50 items of which pur- 
port to measure knowledge of general principles of 
selling, and were selected by their power to dis- 
criminate sales from nonsales groups. Reliability 
is unstated; correlation with a group intelligence 
test is zero. Percentile norms are given for 126 
salesmen, and for nonsales groups of 501 men and 
87 women. 


GUILFORD, J. P., SHNEIDMAN, E., AND ZIMMERMAN,
W. S. The Guilford-Shneidman-Zimmerman In- 
terest Survey. High school-college-adult. 1 form. 
(45) min. Test booklet (20c ea., $13.50 per 
100); answer sheet (2c); profile sheet, boys- 
men or girls-women (2c); manual, pp. 6 (25c). 
Beverly Hills, Calif.: Sheridan Supply Co., 1949. 


A 360-item interest questionnaire, based on fac- 
tor-analysis evidence, indicating interests in 9 gen- 
eral areas: artistic, linguistic, scientific, mechani- 
cal, outdoor, business-political, social activity, per- 
sonal assistance, and office work. Each of the 9 
areas is divided further into two subareas, and 
both “hobby interest” and “vocation interest” indi- 
cations are secured for each. Split-half reliabil- 
ities range from .68 to .95 and are mainly above 
.85. Validities, as is usual for interest inventories, 
are unknown. The inventory is arranged for con- 
venience in administration, and may be scored 
without a key. The profiles transmute the raw 
scores into standard scores. 


GUILFORD, J. P., AND ZIMMERMAN, W. S. The
Guilford-Zimmerman Temperament Survey. 
High school-college-adult. 1 form. (45) min. 
Test booklet (15¢, $10. per 100); answer sheet, 
IBM or hand scoring (3¢); profile chart, men 
or women (2¢); keys ($2.); manual, pp. 12 
(25¢). Beverly Hills, Calif.: Sheridan Supply 
Co., 1949. 


The Guilford series of personality questionnaires 
has been reduced to one form of 300 items from 
which ten traits are estimated: general activity, 
restraint, ascendance, sociability, emotional stabil- 
ity, objectivity, friendliness, thoughtfulness, per- 
sonal relations, and masculinity. Each area is rep- 
resented by 30 items, none being scored for more 


than one variable. Scoring is simplified by the use 
of weights of 0 and 1 only. Kuder-Richardson re- 
liabilities of trait scores are reported from .75 to 
.87. As the outstanding omnibus instrument based 
primarily on factor analyses, the Survey will have 
usefulness for screening, rapid evaluation and re- 
search. 
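
For readers unfamiliar with the Kuder-Richardson procedure mentioned above, the following is a minimal sketch of one common variant, formula 20, for items scored 0 or 1. The review does not say which Kuder-Richardson formula the authors used, so the choice of formula 20 is an assumption; the function name `kr20` and the toy responses are hypothetical and are not Survey data.

```python
from typing import Sequence


def kr20(item_scores: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson formula 20 for items scored 0 or 1.

    item_scores[p][i] is the score of person p on item i.
    """
    n_persons = len(item_scores)
    n_items = len(item_scores[0])
    if n_persons < 2 or n_items < 2:
        raise ValueError("need at least two persons and two items")

    # Sum over items of p * (1 - p), where p is the proportion passing.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in item_scores) / n_persons
        sum_pq += p * (1.0 - p)

    # Population variance of the total scores.
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_persons
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical responses of four persons to three items (illustration only).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(toy), 2))
```

Because the coefficient depends only on item difficulties and the variance of total scores, it can be computed from a single administration, which suits a one-form inventory of this kind.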


LEITER, RUSSELL G. Leiter Adaptation of Arthur's
Stencil Design Test. Adult. 1 form. (30) min. 
10 designs and 19 stencils ($3); record card 
($3. per 100); qualitative check list for clinical 
use ($3. per 100). Washington: Psychological 
Service Center Press, 1949. 


This upward revision of the stencil design test 
is substantially the “shoulder patches” item of the 
Army Individual Test. As a measure of ability in 
male adults, it shows a correlation with the Re- 
vised Stanford-Binet of .62. Norms, in terms of 
MA and IQ are for adult young men only. An in- 
teresting clinical check list (by Neal Watson) calls 
attention to observations of ability to shift, syn- 
thesize, plan, and analyze, that may be useful indi- 
cators of brain injury. 


LEITER, RUSSELL G. Leiter Adaptation of the Paint-
ed Cube Test. Adult. 1 form. (15) min. 24
painted cubes ($6); 3 printed card models 
($3.); record card ($3. per 100). Washington: 
Psychological Service Center Press, 1949. 


Although some versions of the cube-assembly 
test are about 50 years old, the present compact 
and useful form was devised by Leiter in 1941, 
and served as a part of the Army Individual Test. 
The revision differs from the well-known World 
War I form by using colored pictures of cube 
assemblies as the models, instead of solid blocks. 
Data on 256 male veterans show that the test cor- 
relates .57 with the Revised Stanford-Binet, and 
that it has superior discrimination at the upper IQ 
levels, above 110. Norms are available only for 
adult young men. 




Manson, Morse P. The Alcadd Test. Adult. 1 
form. (15) min. Test booklet ($2.50 per 25), 
with manual, pp. 2; specimen set (25¢). Re- 
stricted distribution. Beverly Hills, Calif.: West- 
ern Psychological Services, 1949. 


Based on an item analysis of the responses of 
alcoholics and nonalcoholics, the 65-question Al- 
cadd is highly discriminative between the two 
groups. It also yields a profile of five main be- 
havior traits of alcoholics. Suggested uses include 
the quantification of the degree of alcoholic addic-
tion and the separation of factors giving insight
into the dynamics of individual alcoholics.


WHISTLER, HARVEY S., AND THORPE, LOUIS P. Mu-
sical Aptitude Test. Gr 4-10. 1 form. Manual,
pp. 23 ($3.), with scoring stencils; answer sheet, 
IBM or hand scoring (4¢). Los Angeles, Calif.: 
California Test Bureau, 1950. 


The new musical aptitude test is administered 
with the piano, the examiner playing from scores 
in the manual. The authors claim a decided virtue 
from this fact, since the test stimuli are signifi- 
cant musical units, in place of the psychophysical 
auditory discriminations used in the past. In addi- 
tion to a total score, three part scores are provid- 
ed; rhythm recognition, pitch recognition and dis- 
crimination (of chords), and rhythm recognition. 
Total reliability is .93, and of the parts from .80 
to .88. Correlations of total scores with teachers’ 
estimates of talent are from .56 to .37, from .78 to 
.52 when corrected for attenuation. There are per- 
centile norms for each grade from 4 to 10, based 


on a geographically distributed sample of 2,000 
cases. While no evidence is presented to demon- 
strate that the test measures “aptitude” apart from 
knowledge and achievement, it provides a con- 
venient and rational basis for the classification 
and guidance of school pupils. 
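
The correction for attenuation cited above can be sketched as follows, using the standard formula in which an observed correlation is divided by the square root of the product of the two reliabilities. The review gives the test reliability (.93) but not the reliability of the teachers' estimates, so the value 0.55 below is purely an assumption chosen for illustration; the function name is likewise hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between true scores on x and y.

    r_xy is the observed correlation; r_xx and r_yy are the
    reliabilities of the two measures.
    """
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


TEST_RELIABILITY = 0.93               # total-score reliability from the review
ASSUMED_CRITERION_RELIABILITY = 0.55  # assumption: not reported in the review

for observed in (0.56, 0.37):
    corrected = correct_for_attenuation(
        observed, TEST_RELIABILITY, ASSUMED_CRITERION_RELIABILITY
    )
    print(observed, round(corrected, 2))
```

A corrected coefficient estimates what the validity would be with perfectly reliable measures, so it should be read as an upper bound on practical prediction rather than as the validity a school would actually observe.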


The FR-CR Test. By Staff, International Psycho- 
logical Service Center. Adult. 1 form. (5) 
min. Record Sheet ($3. per 100), with manual. 
Washington: Psychological Service Center Press, 
1949. 


A verbal memory test in which the examiner 
reads a passage aloud, followed by a free recall 
(FR), and a controlled recall (CR) by asking 7 
questions. Quantitative data lead to MA and IQ 
scores, which correlate .63 with the Revised Stan- 
ford-Binet. Qualitative observations are also sug-
gested for diagnostic clinical use.


1949 DIRECTORY
AMERICAN PSYCHOLOGICAL ASSOCIATION 


1515 MASSACHUSETTS AVENUE N. W. 
WASHINGTON 5, D.C. 


In the alphabetical list of 6785 members, the 1949 Directory of the Asso- 
ciation gives the names of the members, their addresses, their present po- 
sitions, their last degrees, and their class of membership. Membership 
lists for the Divisions of the Association, the lists of Diplomates in the 
fields of clinical, industrial, and counseling of the American Board of 
Examiners in Professional Psychology, the By-Laws, and a geographical 
and institutional index of members are included. The editor is Helen M.
Wolfle of the Association Staff. 250 pages, $2.00.
























Theory and Practice 


of Psychological Testing 
FRANK S. FREEMAN, Cornell University


This new book presents a comprehensive description of psychological tests and a 
clear discussion of the principles upon which they are based. It covers all types 
of tests, but a special feature is the extensive treatment — descriptive, evalua- 
tive, critical — of personality inventories and projective methods, especially in 
dealing with the Rorschach Test, the Murray Thematic Apperception Test, and 
Situational Tests. The most recent experimental techniques are included, and 
the emphasis throughout is on individual and clinical interpretation of test find- 
ings. The book is illustrated with numerous tables, graphs, charts, and photo- 
graphs of actual tests. 


“. . . a very useful book. In my opinion it would be easy to study. A wise se-
lection of illustrative material.” —Donald M. Johnson, Michigan State College
“I have already recommended it highly. . . . I like the clarity with which Dr.
Freeman has presented his materials.” — H. T. Manuel, The University of
Texas

1950 518 pages $3.50
HENRY HOLT & CO., 257 Fourth Avenue, New York 10



































Outstanding McGRAW-HILL Books 


HANDBOOK OF EMPLOYEE SELECTION 

By ROY M. DORCUS, University of California at Los Angeles, and
MARGARET HUBBARD JONES, The State College of Washington. McGraw-Hill
Publications in Psychology. In press
Gathers together all the relevant information contained in many scattered refer- 
ences dealing with the selection of employees by means of scientific procedures— 
mostly tests. It covers all types of regular, civilian-paid employment, including 
factory and clerical jobs, teaching, and executive positions. The presentation is in
the form of abstracts, containing only essential data, which are arranged chrono- 
logically. 


GENERAL CLINICAL COUNSELING. In Educational Institutions 
By MILTON E. HAHN, and MALCOLM S. MACLEAN, University of Cali-
fornia, Los Angeles. In press 
Collects and organizes into teachable and comprehensible form the materials per- 
tinent to the work of clinical psychologists who counsel with individuals having 
problems within the normal range of problem depth. The emphasis is on the pro- 
fessional psychologist as a counselor; the approach is in terms of functions actual- 
ly performed by the clinical psychologist. The reader is taken from the beginnings 
of professional training, through principles and tools of the counselor, into the 
nature of the problems which are his major concern. 





PHYSIOLOGICAL PSYCHOLOGY. New 2nd edition 


By CLIFFORD T. MORGAN, The Johns Hopkins University and ELIOT
STELLAR, The Johns Hopkins University. McGraw-Hill Publications in
Psychology. In press
A comprehensive and authoritative survey of experimental facts in the fields of 
physiological psychology. After a historical introduction and a review of the basic 
facts of physiology and the nervous system, the book gives an extended treatment 
of the physiological basis of psychological development, sensory and motor phe- 
nomena, motivation, etc. 


INTRODUCTION TO NEUROPATHOLOGY 

By SAMUEL P. HICKS, M.D. and SHIELDS WARREN, M.D., Departments
of Pathology of the Harvard Medical School and the New England Dea-
coness Hospital. 475 pages, $10.00
This profusely illustrated volume of the fundamentals of neuropathology presents
a new approach to the mechanisms, dynamic sequences, and pathologic physiology
of disease processes in nervous tissues. The authors treat the special features of 
nervous disease together with the basic principles of the disease processes in 
general pathology. 


Send for copies on approval 


McGRAW-HILL BOOK COMPANY, Inc.


330 West 42nd Street New York 18, N. Y.